OLIGONUCLEOTIDE ARRAY AND METHODS OF USE



The present invention is directed to an oligonucleotide array ('generic' array)
consisting of a plurality of different oligonucleotides of predetermined sequence, attached to a
solid surface at predetermined positionally distinct locations, characterised in that the
oligonucleotides have substantially the same melting temperature (Tm). The invention is also
directed to probe and primer targeting polynucleotides which contain nucleotide sequences
that are complementary to different target nucleic acids from a test sample as well as to
the oligonucleotides on the array. The 'generic' array and targeting oligonucleotides prove
useful in a number of hybridisation capture methods and assays such as sequence
identification by allele specific hybridisation, fingerprinting and genome typing, differential
gene expression profile analysis, or in clinical diagnostic methods utilising, for example,
ARMS amplification primers. The present invention is therefore also directed to methods or
assays utilising such 'generic' arrays and, in particular, to the use of a multiplex amplification
assay that utilises a plurality of ARMS primers possessing non-amplifiable tails in
conjunction with a 'generic' array for detecting amplification of variant nucleic acid
sequences.

Microarray (also termed hybridisation array, gene array or gene chip) technology,
wherein nucleic acid molecules attached to solid substrates at predefined locations in small
areas and at high density are used, in conjunction with hybridisation reactions, for identifying
and discriminating target nucleic acid sequences, has advanced rapidly in the past few years.
These chips or microarrays allow massive parallel data acquisition and are used, for example,
in polymorphism detection, clinical mutation detection, expression monitoring, fingerprinting
and sequencing.

A variety of methods are currently available for making arrays of biological
molecules. The 'dot or slot blot' approach, whereby an ordered array of DNA is vacuum
blotted using a manifold, or hand blotted by capillary action, onto a porous membrane such as
nylon or nitrocellulose, has been around for many years (Maniatis et al., Molecular Cloning-A
Laboratory Manual, First Edition, Cold Spring Harbor, 1982). Methods for preparing a
plurality of oligonucleotide sequences and for attaching these to solid supports at high density
are also known in the art. For example, US Patent No. 4,562,157 describes a method of using
photo-activatable cross-linking groups to immobilise pre-synthesised ligands on surfaces.
Fodor et al. (Nature. 364:555-556, 1993) and US Patent No. 5,143,854 describe the
'light-directed chemical synthesis' method for synthesising ligands, including oligonucleotides,
directly onto a substrate surface at the desired location. US 5,700,637 also describes methods
for in situ synthesis of oligonucleotides on solid support surfaces. In addition, such methods
for preparing microarrays can easily be automated. International Publication No. WO
95/35505 discloses an automated capillary dispensing device and method for applying
biological macromolecules to solid supports. International Publication No. WO 97/44134
also describes devices for delivery of small volumes of liquid (which may contain biological
macromolecules) in a precise manner to produce 'microsized spots' on a solid surface to
generate a microarray. Similarly, International Publication No. WO 98/10858 describes
an apparatus for the automated synthesis of molecular arrays. Techniques exist for applying
the oligonucleotides to the array at high density; for example, well in excess of 10^3 distinct
polynucleotides can be applied per cm2.

Many of the advances in microarray technology concern increasing miniaturisation.
Aside from the ease in handling and manipulating smaller hybridisation matrices, one
significant advantage that smaller chips with a higher density of capture probe have over larger
formats is that the sample does not have to be "stretched out". The technology is, however,
unlikely to be widely accepted in the clinical diagnostic market until costs have been
substantially reduced from their current levels.

Oligonucleotide DNA arrays consisting of short oligonucleotides (typically 8-mers
to 20-mers) bound via their 3' termini to a solid surface such as glass or a silicon wafer have
been proposed as tools for mutation detection or for the resequencing of genes (e.g. Chee et
al., Science, 1996, Vol. 274, pp 610-614; Drobyshev et al., Gene, 1997, Vol. 188, pp 45-52).
The mechanism of analysing DNA sequences depends on the principles of allele specific
hybridisation. In brief, an oligonucleotide array is prepared with a set of overlapping
oligonucleotide probes (e.g. 20-mers) complementary to the consensus sequence of the DNA
target, designed so that each sequence is offset from the previous sequence by one base pair.
As well as the consensus sequence, all three variant sequences at the central nucleotide
position are also included on the array. Often the probe sequences for analysing the opposite
template strand are also included on the array. The target sequence is amplified using the
polymerase chain reaction (PCR) and the products are then often transcribed into RNA which
has incorporated therein a fluorescent label such as fluorescein-UTP. The labelled RNA is
sheared into short fragments and then hybridised to the DNA array. The most stable hybrids
form when the target sequence binds to its fully complementary sequence on the array. If
there is a mutation in the target sequence then it will bind to one of the variant probes more
effectively than to the consensus probe. The relative extent of binding of the target sequence
to the probes is measured by monitoring the intensity of fluorescence at each site on the DNA
array. If the target sequence is mutated then the fluorescence intensity will be greater at one
of the variant probe positions than at the consensus probe position.

Microarray technology also makes it possible to simultaneously study the expression
of many thousands of genes in a single experiment. Differential expression profiles from, for
example, normal versus diseased tissues or induced versus un-induced tissues can be obtained
by hybridising the product of expressed mRNA to complementary nucleic acid at pre-defined
locations on the array. Alternatively, a time-course of expression of thousands of genes over
several experiments from a single sample could be performed. Analysis of gene expression in
human tissue (e.g. biopsy tissue) can assist in the diagnosis and prognosis of disease and the
evaluation of risk for disease. A comparison of levels of expression of various genes from
patients with defined pathological disease conditions with those from normal patients enables an
expression profile, characteristic of disease, to be created. There are currently two approaches
to analysing gene expression using microarrays. In the first approach, cDNA fragments, often
generated by PCR, for each of the genes under study are attached to an array. Typically,
mRNA isolated from one of the test samples (e.g. induced or un-induced) is reverse transcribed into
cDNA with incorporation of a fluorescent label. The cDNA is sheared and hybridised to the
array. The other test sample mRNA can be reverse transcribed with incorporation of a
different fluorescent label to enable direct comparison of the expression level of each test gene
on the same array (see WO 95/35505). The second approach is similar to the first except that
an oligonucleotide microarray is used. Because of the differences in hybridisation properties
between short oligonucleotide probes, each gene must be represented by several
oligonucleotides (typically 20 or more) on the chip. In addition, a partner control
oligonucleotide identical to each oligonucleotide, except for one of the central nucleotides, is
included on the array to serve as an internal control for hybridisation sensitivity. Thus,
whereas cDNA arrays only require each gene to be represented by a single hybridisation
partner on the array, with oligonucleotide arrays each test gene must be represented by
approximately 40 distinct oligonucleotides, each at a different position on the array. The
advantage of oligonucleotide arrays over cDNA arrays, however, concerns the shelf-life of the
sample on the array. In general, a cDNA library prepared on an array is useable for weeks,
whereas pre-prepared oligonucleotide arrays can be stored for considerably longer.

The strength of the DNA microarray concept is its ability to carry out very large
numbers of hybridisation based analyses simultaneously. However, as the capture sequences
attached to the support (chip) have to complement the target sequence, knowledge of the
target sequence is required. Each chip has to be custom built on the basis of this known
sequence. The need to develop a new custom chip for each new test renders the technology
costly and complex. Other concerns involve the hybridisation conditions that must be adopted
for each test. Secondary and tertiary structure formation can interfere with hybridisation of
the capture and target molecules. In addition, duplexes formed between different individual
pairs of target sequence and capture sequence may have different stabilities (melting
temperatures), because of different G-C content, for example. Current approaches to
overcome some of these hybridisation problems include: applying parallel hybridisation
across the array, altering the concentration of capture nucleic acid at a particular location,
modifying the length of the oligonucleotide at a particular location so as to alter duplex
stability, and using tuned electric fields as demonstrated by Edman et al. (Nucleic Acids
Research. 25(24):4907-4914, 1997). In practice, however, as DNA duplexes between different
individual sequences on the arrays and their cognate complementary target sequences have
different stabilities, custom hybridisation conditions have to be employed for each
particular test. In view of the different hybridisation stabilities, however, the hybridisation
conditions adopted are generally not optimal for each and every capture sequence on the
array, but are generally a compromise. It would be advantageous to design a microarray
wherein substantially the same hybridisation between each pair of target and capture molecule
occurs under any chosen hybridisation conditions.

According to a first aspect of the invention there is provided a solid support having
immobilised thereon a plurality of oligonucleotides at pre-defined positionally distinct sites,
characterised in that the sequence of each oligonucleotide that binds to its complementary
sequence has substantially the same melting temperature (Tm).

According to one aspect of the present invention there is provided a solid support
having immobilised thereon a plurality of pre-selected oligonucleotides at pre-defined
positionally distinct sites, characterised in that the sequence of each oligonucleotide when
bound to its complementary sequence has substantially the same melting temperature (Tm) as
the other oligonucleotides on the support.

In a preferred embodiment the oligonucleotides attached to the solid support are
non-complementary with genomic DNA and non-complementary with each other.

With regard to the meaning of "substantially the same melting temperature (Tm)",
each single-stranded oligonucleotide immobilised on the solid support has, in increasing order
of preference, a Tm when bound to its complementary sequence within 8°C, 7°C, 6°C, 5°C,
4°C, 3°C, 2°C, 1°C, and 0.5°C of the average Tm of all the oligonucleotides immobilised on
the solid support. In a more preferred embodiment the oligonucleotides immobilised on the
solid support will possess melting temperatures within a range of 0 to 8°C of each other. In
another preferred embodiment at least 90% of the oligonucleotides on the array have melting
temperatures within 4°C, preferably within 2°C, of each other. In an even more preferred
embodiment the oligonucleotides immobilised on the solid support will possess melting
temperatures within a range of 0 to 2°C of each other. In the most preferred embodiment,
90-100% of all the oligonucleotides immobilised on the solid support will possess the same
melting temperature, and the remaining oligonucleotides will preferably possess melting
temperatures within a range of 0 to 2°C of this mode value.

Although it is preferred that all of the oligonucleotides on the solid support fall within
the ranges or values for melting temperature as defined above, it is envisaged that a small
number of oligonucleotides, preferably less than 1-5%, more preferably less than 2% of the
total number of oligonucleotides, may fall outside these ranges or values.
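
As a rough illustration of the tolerances described above, the following Python sketch checks a list of melting temperatures against a chosen tolerance around their average and reports the spread and the fraction of outliers. The function names and the example values are hypothetical and are not taken from the specification.

```python
from statistics import mean


def within_tolerance(tms, tolerance_c):
    """Flag each Tm that lies within tolerance_c (deg C) of the average Tm."""
    average = mean(tms)
    return [abs(tm - average) <= tolerance_c for tm in tms]


def fraction_outside(tms, tolerance_c):
    """Fraction of oligonucleotides whose Tm falls outside the tolerance."""
    flags = within_tolerance(tms, tolerance_c)
    return flags.count(False) / len(flags)


def spread(tms):
    """Difference between the highest and lowest Tm in the set."""
    return max(tms) - min(tms)


# Hypothetical Tm values (deg C) for four capture oligonucleotides.
example_tms = [60.0, 60.5, 59.5, 60.0]
print(within_tolerance(example_tms, 0.5))
print(fraction_outside(example_tms, 0.25))
print(spread(example_tms))
```

A set would satisfy, for instance, the "within 2°C of each other" preference when `spread` returns a value of 2.0 or less.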

The melting temperature (Tm) referred to herein is defined as the temperature at
which duplex DNA exists in a ratio of 50:50 in hybridised and dissociated form under
equilibrium conditions. The principal governing factors determining Tm are sequence length
and G-C content. The theoretical and experimental procedure for determining the Tm is
disclosed in Molecular Cloning-A Laboratory Manual, Second Edition, J Sambrook et al.,
Cold Spring Harbor, Chapter 11, sections 46 and 55. In essence, for oligonucleotides shorter
than 18 nucleotides, the Tm of the hybrid is estimated by multiplying the number of A + T
residues in the hybrid by 2°C and the number of G + C residues by 4°C and adding the two
together. For oligonucleotides between approximately 14 and 70 nucleotides in length, the
following equation devised by Bolton and McCarthy (P.N.A.S. 48:1390, 1962) for
determining the Tm of long DNA molecules is also applicable:

Tm = 81.5 + 16.6(log10[Na+]) + 0.41(% G + C) - (600/N)

wherein N = chain length and [Na+] is the ionic strength of the hybridisation solution.
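
The two estimates described above can be sketched in Python as follows. This is a minimal illustration only: the function names are hypothetical, [Na+] is assumed to be given in mol/l, and the salt term is taken with a positive sign, as in the commonly cited form of the equation, so that a lower salt concentration gives a lower estimated Tm.

```python
import math


def wallace_tm(seq):
    """Short-oligonucleotide estimate: 2 deg C per A/T and 4 deg C per G/C."""
    s = seq.upper()
    at = s.count("A") + s.count("T") + s.count("U")
    gc = s.count("G") + s.count("C")
    return 2.0 * at + 4.0 * gc


def salt_adjusted_tm(seq, na_molar=1.0):
    """Estimate for roughly 14 to 70 nucleotides.

    Assumes the form 81.5 + 16.6*log10([Na+]) + 0.41*(%G+C) - 600/N,
    with [Na+] in mol/l (an assumption of this sketch).
    """
    s = seq.upper()
    n = len(s)
    if n == 0:
        raise ValueError("sequence must not be empty")
    gc_percent = 100.0 * (s.count("G") + s.count("C")) / n
    return 81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc_percent - 600.0 / n


# A 12-mer with six A/T and six G/C residues.
print(wallace_tm("ATGCATGCATGC"))
# A 20-mer with 50% G-C content at two salt concentrations.
print(round(salt_adjusted_tm("ATGC" * 5, 1.0), 1))
print(round(salt_adjusted_tm("ATGC" * 5, 0.1), 1))
```

Both functions are coarse estimates; they ignore nearest-neighbour effects, denaturants and probe concentration.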

The term "nucleotide" as used herein can refer to nucleotides present in either DNA or
RNA and thus includes nucleotides which incorporate adenine, cytosine, guanine, thymine
and uracil as base, the sugar moiety being deoxyribose or ribose. It will be appreciated,
however, that other modified bases capable of base pairing with one of the conventional bases,
adenine, cytosine, guanine, thymine and uracil, may be used in the oligonucleotides, probes or
primers employed in the present invention. Such modified bases include, for example,
8-azaguanine and hypoxanthine.

The term "oligonucleotide" as used herein is defined as a molecule comprised of two
or more nucleotides (i.e. deoxyribonucleotides or ribonucleotides), preferably more than five.
Its exact size will depend on many factors, such as the reaction temperature, salt
concentration, the presence of denaturants such as formamide, and the degree of
complementarity with the sequence to which the oligonucleotide is intended to hybridise.

In operation, under any hybridisation conditions adopted, all of the oligonucleotides
on the solid support (herein referred to as "capture oligonucleotides") that are to be used for
capture of a target sequence exhibit approximately the same hybridisation stability with their
cognate complementary sequence as the other pairs of oligonucleotide and complementary
target sequence. This ensures that approximately equivalent amounts of target DNA are
bound to the complementary oligonucleotide on the array at any particular time, facilitating
quantitative analysis. As the Tm of all the duplexed oligonucleotides will be substantially the
same, the optimum temperature for hybridisation can be adopted. Oligonucleotide
hybridisation is generally carried out at a temperature 5-10°C below the Tm, with the
hybridisation and subsequent washes
carried out under stringent conditions. Ideally, the hybridisation temperature is controlled
precisely, preferably to ±2°C, more preferably to ±0.5°C or better, particularly when the
hybridisable length of the capture oligonucleotides is small and there is a need to
discriminate between two sequences that may only differ by a single nucleotide at one or other
of the termini of the hybridisable sequence. It will be apparent that the capture
oligonucleotides need not all be of the same length. Depending on their relative G-C content,
two oligonucleotides of different lengths may nevertheless have the same Tm. Capture
oligonucleotides of substantially the same length and G-C content are preferred, however.

According to a further aspect of the invention there is provided a solid support having
immobilised thereon a plurality of pre-selected oligonucleotides at pre-defined sites,
characterised in that the capture portions of all of the oligonucleotides are of substantially the
same length and all have substantially the same G-C content.

With regard to the meaning of "substantially the same length", in increasing order of
preference, the length of each capture portion of the oligonucleotide immobilised on the solid
support will be within or equal to 16, 12, 10, 8, 6, 5, 4, 3, 2, and 1 nucleotide(s) of the average
length of all the capture portions of the oligonucleotides immobilised on the solid support. In
a preferred embodiment the oligonucleotide capture portions will each be of a length that is
within 0-8 nucleotides of each other.

With regard to the meaning of "substantially the same G-C content", in increasing
order of preference, each of the oligonucleotides immobilised on the solid support will have a
G-C content within or equal to 25%, 20%, 15%, 10%, 5% and 2% of the average G-C
content of all the immobilised oligonucleotides. In a preferred embodiment, 95-100% of all
the oligonucleotides immobilised on the support will have a percentage G-C content within
10% of the median value and the remainder will preferably be within 25% of the median
value. More preferably the percentage G-C content of the oligonucleotides immobilised on
the solid support will be within 8% of each other.
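
The length and G-C content criteria above could be screened along the following lines. This is a sketch under stated assumptions: the tolerances are interpreted as absolute differences (nucleotides, and percentage points of G-C content) from the average of the set, and the function names and example sequences are hypothetical.

```python
from statistics import mean


def gc_percent(seq):
    """Percentage of G and C residues in a sequence."""
    s = seq.upper()
    return 100.0 * (s.count("G") + s.count("C")) / len(s)


def composition_outliers(seqs, max_len_diff=2, max_gc_diff=10.0):
    """Return sequences whose length or G-C content deviates from the set average.

    max_len_diff is in nucleotides; max_gc_diff is in percentage points
    (an interpretation assumed for this sketch).
    """
    avg_len = mean(len(s) for s in seqs)
    avg_gc = mean(gc_percent(s) for s in seqs)
    return [
        s for s in seqs
        if abs(len(s) - avg_len) > max_len_diff
        or abs(gc_percent(s) - avg_gc) > max_gc_diff
    ]


# Four 10-mers with 40% G-C content and one with 100% G-C content.
example = ["ATGCATGCAT", "GCATGCATAT", "TTGCAAGCTA", "ATCGTAGCTA", "GGGGCCCCGC"]
print(composition_outliers(example, max_len_diff=2, max_gc_diff=15.0))
```

Sequences returned by `composition_outliers` would be candidates for exclusion from use as capture molecules.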

In a preferred embodiment the capture portions of all of the oligonucleotides are of the
same length and have the same G-C content.

Although it is preferred that all of the oligonucleotides on the solid support fall within
the ranges for length and G-C content as defined above, it is envisaged that a small number of
oligonucleotides, preferably less than 1-5%, more preferably less than 2% of the total number
of oligonucleotides, may fall outside these ranges. The presence of oligonucleotides on the
solid support that possess capture sequences that fall outside the preferred sequence
composition (i.e. length and G-C content) need not diminish the utility of the generic array,
particularly if these capture sequences are excluded from being used as capture molecules
when the array is in use.

In a further preferred embodiment there are at least 50 different oligonucleotides
immobilised on the solid support, and in increasing order of preference, there are at least 100,
500, 1000, 5000, 10000 or more different oligonucleotides immobilised on the solid support.
In the most preferred embodiment, there are between 50 and 500 different oligonucleotides
immobilised on the solid support.

In a further embodiment, the oligonucleotides are immobilised on the solid support at
a density in the range of, in increasing order of preference, 1 to 1000 per cm2, 200 to 1000 per
cm2, 200 to 500 per cm2, 1 to 200 per cm2, 1 to 50 per cm2, and 1 to 10 per cm2. Most
preferred is about 100 per cm2. In a particular embodiment each distinct capture
oligonucleotide is immobilised to the base of a well of a conventional microtitre plate.

Although it is preferred that the oligonucleotides at each pre-determined location on 
the solid support are unique, this is not essential. Duplicate, triplicate etc., representation of 
one, more or all capture oligonucleotides on the solid support (array) may be desired in order 
to detect replicate values. 

The capture oligonucleotides attached to the solid support generally have a
hybridisable sequence between 5 and 100 nucleotides in length. The preferred length of
hybridisable sequence is in the range of 10-50, more preferred is 20-35, still more preferred
is 12-30. The hybridisable sequence is that portion of the capture oligonucleotide (capture
portion) that is designed and available for hybrid formation with its complementary sequence.
As used herein, the terms "hybridisable sequence" and "capture portion" are used
interchangeably. Non-hybridisable sequence of the capture oligonucleotide might represent
flanking or tether sequences. Tether sequences not only serve to anchor the oligonucleotide to
the solid support but also serve to distance the hybridisable portion of the capture
oligonucleotide from the solid support to alleviate steric interference.

The primary structure of each unique capture oligonucleotide on the array can either
be designed manually, or can be designed using a computer program to generate random
nucleotide sequences. A suitable macro for randomly designing oligonucleotides is disclosed
in example 2 herein. It is preferable that none of the capture oligonucleotides on the array
(solid support) are capable of hybridising, under the stringent hybridisation conditions adopted, to
any of the sample target sequences. In this respect, it is preferred that none of the capture
oligonucleotides are capable of hybridising with any part of the genome of the test organism
under study, be this human, simian, bacterial, viral or the like. In a more preferred
embodiment all of the capture oligonucleotides on the array are artificial and lack
complementarity to any known sequence from whatever origin. It is also preferred that none
of the capture oligonucleotides cross-hybridise to any test sample nucleic acid. However, if
one or more of the capture oligonucleotides on the array do bind to a test sample nucleic acid,
this is not detrimental provided that it is known in advance so that any false positive result can
be discounted.
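
By way of illustration only, random capture sequences of fixed length and G-C content might be generated as in the Python sketch below. This is not the macro of example 2; all names and default values are hypothetical. The sketch rejects only duplicates, self-complementary sequences and exact full-length complements of sequences already chosen. Screening for partial complementarity, or against the genome of the test organism, would need a separate sequence-comparison step that is not shown.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def random_oligo(length, gc_count, rng):
    """Random sequence with exactly gc_count G/C residues."""
    bases = [rng.choice("GC") for _ in range(gc_count)]
    bases += [rng.choice("AT") for _ in range(length - gc_count)]
    rng.shuffle(bases)
    return "".join(bases)


def design_capture_set(n, length=20, gc_count=10, seed=0):
    """Pick n distinct oligos of equal length and equal G-C content."""
    rng = random.Random(seed)
    chosen = []
    seen = set()
    while len(chosen) < n:
        candidate = random_oligo(length, gc_count, rng)
        rc = reverse_complement(candidate)
        # Skip duplicates, self-complementary sequences, and exact
        # complements of sequences that are already in the set.
        if candidate in seen or rc in seen or candidate == rc:
            continue
        chosen.append(candidate)
        seen.add(candidate)
    return chosen


capture_oligos = design_capture_set(5)
for oligo in capture_oligos:
    print(oligo)
```

Fixing both the length and the number of G/C residues is one simple way to keep the estimated Tm of the set uniform; it is a design choice of this sketch and not a requirement of the text.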

The capture oligonucleotide molecules may be individually synthesised on a standard
oligonucleotide synthesiser. These oligonucleotides (oligos) may then be attached to the
substrate matrix by any of a variety of techniques known in the art, such as by using
photochemical reagents, such as disclosed in US Patent Nos. 4,542,102 and 4,713,326. US
Patent No. 4,562,157 also describes a method of using photo-activatable cross-linking groups
to immobilise pre-synthesised ligands on surfaces. Alternatively, the oligonucleotides can be
synthesised directly onto the solid surface using photolithography techniques, such as
disclosed in US Patent No. 5,143,854, or other methods such as disclosed in US Patent No.
5,700,637, or International Publication Nos. WO 95/35505, WO 97/44134 or WO 98/10958.
Schena et al. (TIBTECH 16(7):301-306, 1998) review the recent advances in microarray
technology including the various means of constructing these arrays. Problems facing current
photolithographic techniques for oligonucleotide synthesis involve the low yield of synthesis
at each synthesis step, and also the efficiency of nucleotide addition at each synthesis step,
which can range from about 80% to 97%, with purines generally having a lower efficiency
than pyrimidines (Thomas & Burke. Exp. Opin. Ther. Patents 8(5):503-508, 1998). When
constructing an array with relatively few capture oligonucleotides, say 500 or less, or with
long oligonucleotides, say 30-mers or more, it may be preferable to synthesise the oligos
separately and affix them to the solid support later rather than to synthesise them in situ.
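
The effect of stepwise efficiency on in situ synthesis can be illustrated with simple arithmetic. The Python sketch below assumes, for illustration only, that every addition step has the same efficiency, that steps are independent, and that an N-mer requires N additions; the function name is hypothetical.

```python
def full_length_fraction(stepwise_efficiency, additions):
    """Fraction of chains that receive every nucleotide addition.

    Assumes a constant, independent efficiency at each step.
    """
    return stepwise_efficiency ** additions


# Compare the two ends of the quoted efficiency range for 20 and 30 additions.
for efficiency in (0.80, 0.97):
    for additions in (20, 30):
        fraction = full_length_fraction(efficiency, additions)
        print(f"{efficiency:.2f} efficiency, {additions} additions: {fraction:.4f}")
```

Under these assumptions roughly half of the chains are full length after 20 additions at 97% efficiency, and only about 1% at 80% efficiency, which is consistent with the preference for separate synthesis of longer oligonucleotides.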

According to a further aspect of the present invention there is provided a method for
the preparation of a generic oligonucleotide microarray of the invention, comprising
synthesising a plurality of different oligonucleotides and then affixing them to a solid support
at pre-defined locations, wherein each oligonucleotide possesses substantially the same Tm as
the other oligonucleotides when annealed to its complementary sequence. In a preferred
embodiment the oligonucleotides are synthesised on a standard oligonucleotide synthesiser
such as an Applied Biosystems model 340A synthesiser.

According to a further aspect of the invention there is provided a method for the
preparation of a generic oligonucleotide microarray, comprising directly synthesising onto a
solid support at pre-defined positions a plurality of different oligonucleotides, each
oligonucleotide possessing substantially the same Tm as the other oligonucleotides when
annealed to its complementary sequence. In a preferred embodiment said synthesis is by the
photolithography technique as described in US Patent No. 5,143,854.

In order to avoid or alleviate steric factors during the capture hybridisation reaction, it
may be desirable to use a tether/linker molecule to tether the capture oligonucleotides to the
solid support. Shchepinov et al. (N.A.R. 25:1155-1161, 1997) disclose the use of various
amino group-containing phosphoramidite moieties to distance the capture oligonucleotide
from the solid support and thus alleviate steric interference. They found that with a linker of
at least 40 atoms in length they obtained up to 150-fold increased hybridisation yields. Based
on the teaching in Shchepinov et al., the person skilled in the art would be able to design and
synthesise suitable tether/linker molecules to reduce steric interference of the support on
hybridisation behaviour of the immobilised capture oligonucleotides of the invention.

Thus, in a preferred embodiment the oligonucleotides are attached to the solid support 
via a tether molecule, such as disclosed in Shchepinov et al. (N.A.R. 25:1155-1161, 1997). 

The novel array with its population of unique capture oligonucleotides is generally
used in conjunction with targeting polynucleotide molecules (herein referred to as "targeting
polynucleotides").

The targeting polynucleotides are comprised of two adjacent oligonucleotide
sequences, optionally separated by a spacer molecule. The first sequence of the targeting
polynucleotide is complementary to one of the capture oligonucleotide sequences on the array.
The second sequence is complementary to or substantially complementary to the target
sequence to be detected and can therefore act as a detection probe or as an amplification
primer. Although the targeting sequence (the second sequence) need not reflect (be precisely
complementary to) the exact sequence of the target, the more closely it does reflect the exact
sequence the better the binding during the annealing process.
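
The two-part structure described above can be sketched in Python as follows. The sketch assumes that all sequences are written 5' to 3', that the capture-binding sequence is placed at the 5' end, and that the spacer is shown as a text placeholder; the function names and example sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def targeting_polynucleotide(capture_oligo, target_region, spacer="[spacer]"):
    """Join a capture-binding sequence and a target-binding sequence.

    capture_oligo: capture oligonucleotide on the array, written 5' to 3'.
    target_region: region of the target strand to be bound, written 5' to 3'.
    """
    first = reverse_complement(capture_oligo)    # hybridises to the array
    second = reverse_complement(target_region)   # hybridises to the target
    return f"5'-{first}{spacer}{second}-3'"


print(targeting_polynucleotide("AAGG", "CCGT"))
```

Because the first sequence depends only on the capture oligonucleotide, the same array can be reused for a new test by changing only the second sequence.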

The term "polynucleotide" as used herein is used to define a molecule composed of ten
or more deoxyribonucleotides or ribonucleotides, preferably more than 25. A polynucleotide
molecule may be made up of two or more oligonucleotide molecules.

The term "complementary to" is used herein in relation to nucleotides to mean a
nucleotide which will base pair with another specific nucleotide. Thus adenosine triphosphate
is complementary to uridine triphosphate or thymidine triphosphate and guanosine
triphosphate is complementary to cytidine triphosphate. It is appreciated that whilst
thymidine triphosphate and guanosine triphosphate may base pair under certain circumstances
they are not regarded as complementary for the purposes of this specification. It will also be
appreciated that whilst cytidine triphosphate and adenosine triphosphate may base pair under
certain circumstances they are not regarded as complementary for the purposes of this
specification. The same applies to cytidine triphosphate and uridine triphosphate.

"Precise complementarity" or "perfectly matched" as used herein is in reference to the
duplex that the poly- or oligonucleotide strands make with one another to form a double
stranded structure such that every nucleotide in each strand undergoes Watson-Crick base
pairing with a nucleotide on the other strand. The term also encompasses the pairing of
nucleoside analogues, such as deoxyinosine, nucleotides with 2-aminopurine bases, and the
like, that may be employed. Conversely, a mismatched pair of nucleotides in a duplex fails to
undergo Watson-Crick bonding.

"Substantially complementary" as used herein, refers to poly- or oligonucleotide 
molecules (or strands) that, under suitable hybridisation conditions (i.e. with reduced
stringency), have sufficient complementarity to specifically anneal together, i.e. to the
exclusion of all other strands, to form a double stranded structure, but wherein one or other
strand has, relative to its partner, a limited number of non-complementary (mismatched)
nucleotides that are incapable of undergoing Watson-Crick base pairing with the
corresponding nucleotide on the other (partner) strand. In a preferred embodiment the number
of mismatch nucleotides does not exceed 20%, more preferably 15%, and still more preferably
10% of the total number of nucleotides in the poly- or oligonucleotide.
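
The mismatch limits above can be checked with a simple count, as in the following Python sketch. It assumes two strands of equal length, both written 5' to 3' and aligned antiparallel without gaps, and counts only Watson-Crick pairs as matches, in line with the definition of "complementary to" given earlier. The function names are hypothetical, and the sketch does not assess whether the strands anneal specifically.

```python
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}


def mismatch_fraction(strand, partner):
    """Fraction of positions that do not form a Watson-Crick pair.

    Both strands are written 5' to 3'; the partner is reversed so that
    the two are compared in antiparallel alignment.
    """
    if len(strand) != len(partner):
        raise ValueError("this sketch only compares strands of equal length")
    partner_reversed = partner.upper()[::-1]
    mismatches = sum(
        (a, b) not in WATSON_CRICK for a, b in zip(strand.upper(), partner_reversed)
    )
    return mismatches / len(strand)


def substantially_complementary(strand, partner, max_fraction=0.20):
    """True if the mismatch fraction does not exceed max_fraction."""
    return mismatch_fraction(strand, partner) <= max_fraction


# A perfect partner, and a partner with one altered nucleotide, for a 10-mer.
print(mismatch_fraction("AAGGCCTTAG", "CTAAGGCCTT"))
print(mismatch_fraction("AAGGCCTTAG", "CTAAGGCCTA"))
print(substantially_complementary("AAGGCCTTAG", "CTAAGGCCTA", max_fraction=0.10))
```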
The targeting polynucleotide molecules may be individually synthesised on a standard
oligonucleotide synthesiser.

When the targeting polynucleotides are for use as amplification primers, the 5' portion
is preferably blocked from acting as template for the polymerase enzyme. This blocking can
be effected by linking the 3' portion and the 5' portion of the polynucleotide in the opposite
sense to one another, the 3' end being generally in the 5' -> 3' sense and the 5' end of the
polynucleotide being in the 3' -> 5' sense, with linkage via their 5' termini. The presence of
the nucleotide sequence of the 5' portion in the opposite orientation prevents the polymerase
enzyme from making a fully double stranded amplification product. The 5' portion of the
polynucleotide thus becomes a single stranded tail on the amplification product. This single
stranded tail (the 5' end portion) can then be utilised for capture of the amplification product
onto the complementary capture oligonucleotide attached to a solid support. The more
preferred means of blocking the polymerisation agent, however, is to incorporate a blocking
moiety (as the spacer moiety) between the 5' portion and 3' portion of the targeting
polynucleotide.

The term "blocking moiety" as used herein means any moiety which when linked, for
example covalently linked, between the 3' portion and 5' portion of the polynucleotide is
effective to inhibit and preferably prevent, more preferably completely prevent, amplification
(which term includes any detectable copying) beyond the polymerisation blocking moiety,
thus leaving the amplification product with a single stranded tail which is the 5' portion of the
polynucleotide. A wide range of blocking moieties may be envisaged for this purpose. For
example the polymerisation blocking moiety may comprise a bead, for example a polystyrene,
glass or polyacrylamide bead, or the polymerisation blocking moiety may comprise a
transition metal such as, for example, iron, chromium, cobalt or nickel (for example in the form
of a transition metal complex with the oligonucleotide tail and the target binding nucleotide
moiety) or an element capable of substituting phosphorus such as, for example, arsenic,
antimony or bismuth linked between the oligonucleotide tail and the target binding nucleotide
moiety. The blocking moiety might similarly involve substitution of the usual phosphate
linking groups, for example where oxygen is replaced, leading to inter alia
phosphorodithioates, phosphorothioates, methylphosphonates, phosphoramidates such as
phosphormorpholidates, or other residues known per se. Alternative blocking moieties
include any 3'-deoxynucleotide not recognised by restriction endonucleases and seco
nucleotides which have no 2'-3' bond in the sugar ring and are also not recognised by
restriction endonucleases. Newton et al. (Nucleic Acids Research. 21(5):1155-1162, 1993)
and EP-B-416817 describe tailed primers with blocking moieties, interposed between the tail
and the target binding portion of the primer, that can be incorporated into the polynucleotides
of this invention.

In a preferred aspect the spacer comprises a non-amplifiable blocking moiety such as
hexaethylene glycol (HEG) monomer, alone or combined with further nucleotides, more
preferably alone. Alternatively the spacer could comprise material such as 2'-O-alkyl RNA
which will not permit replication of a complementary strand by DNA polymerase enzymes
that lack a reverse transcriptase function.

To avoid false positive detection using the microarray and the targeting
polynucleotides of the invention, it is desirable that none of the targeting polynucleotides are
capable of hybridising to each other. Naturally, it is also desirable that none of the tails (the
5' portions of the targeting polynucleotides) nor spacer moieties are capable of binding to any
nucleic acid in the nucleic acid sample. If the capture hybridisation is to be effected in the
presence of all the test sample nucleic acid, i.e. without separation of targeting polynucleotide
bound nucleic acid from unbound nucleic acid, it is also desirable that none of the capture
oligonucleotides on the solid support are capable of binding to any target nucleic acid in the
original test sample.
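
One way to screen a set of tails for unwanted mutual hybridisation is sketched below in Python. It is an illustration only: it assumes that the longest perfectly complementary antiparallel stretch between two tails is an adequate proxy for their ability to hybridise, and the threshold `max_run` and all names are hypothetical rather than taken from the specification.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def longest_complementary_run(a: str, b: str) -> int:
    """Longest contiguous stretch of `a` that can base-pair with `b`.

    Implemented as a longest-common-substring search between `a` and the
    reverse complement of `b`.
    """
    a = a.upper()
    target = reverse_complement(b)
    best = 0
    prev = [0] * (len(target) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            if a[i - 1] == target[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def cross_hybridising_pairs(tails: dict, max_run: int = 6) -> list:
    """List (name, name, run) for every pair whose run exceeds `max_run`."""
    names = sorted(tails)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            run = longest_complementary_run(tails[first], tails[second])
            if run > max_run:
                flagged.append((first, second, run))
    return flagged
```

The same function could be applied between each tail and the sample sequences, or between each capture oligonucleotide and the sample sequences, to address the other two conditions in the paragraph above.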

According to a further aspect of the invention there is provided a plurality of
polynucleotides, each polynucleotide comprising a unique 3' portion substantially
complementary to a unique target nucleic acid sequence which may be present in a sample, a
5' portion complementary to one of a group of pre-selected oligonucleotides that each possess
substantially the same melting temperature (Tm) and are attached at pre-defined positions to a
solid support, and optionally a spacer moiety interposed between said 3' portion and said 5'
portion.

In a preferred embodiment, each of the unique 3' portions is precisely complementary
with its cognate target sequence. It is to be expected however, that not all the target
sequences, complementary to each and every unique 3' portion, will be present in a test
sample.

In another preferred embodiment, each of the unique 5' portions from within the
population of polynucleotides has substantially the same G-C content and substantially the
same length as each of the other 5' end portions.

In a more preferred embodiment, each of the unique 5' portions from within the
population of polynucleotides has the same G-C content and length as each of the other 5' end
portions.
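
The link between matched G-C content, matched length and matched Tm can be illustrated with a short Python sketch. It uses the Wallace rule, Tm = 2(A+T) + 4(G+C) in degrees Celsius, which is only a rough estimate for short oligonucleotides and is an assumption here: the specification does not state how Tm is to be calculated. The tolerance `max_spread_c` and the function names are likewise hypothetical.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough Tm estimate in degrees C by the Wallace rule: 2(A+T) + 4(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def tails_are_matched(tails: list, max_spread_c: float = 2.0) -> bool:
    """True when the estimated Tm values differ by no more than the tolerance."""
    tms = [wallace_tm(tail) for tail in tails]
    return max(tms) - min(tms) <= max_spread_c
```

Under this rule two tails of equal length and equal G-C content always receive the same estimate, which is why fixing both properties is a simple route to a set of tails with substantially the same Tm.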

The 3' portion of the targeting polynucleotide represents the target binding sequence.
This sequence can be of any length although it will preferably be between 8 and 60
nucleotides in length, more preferably between 12 and 35 nucleotides in length. The target
binding portion of the targeting polynucleotide sequence need not possess precise
complementarity to the target sequence; however, it must have sufficient complementarity (be
substantially complementary) to bind specifically to the target sequence, that is to say under
appropriate hybridisation stringency conditions the target binding region of the primer will
hybridise to the target region (if present in the sample) to the exclusion of other regions.
There are applications, such as with the ARMS technique, where nucleotide mismatches are
incorporated into the target binding primer in order to assist in destabilising primer binding to
incorrect target sequences. The presence of certain mismatches need not, however, prevent
primer binding to the desired target template sequence. In general however, and particularly
when relying on allele specific hybridisation, it is preferred that the target binding portion of
the targeting polynucleotides has precise complementarity to its target sequence.

The expression "target nucleotide sequence" or "target nucleic acid" or "target
sequence" as used herein means a nucleotide or nucleic acid sequence comprising the
sequence to be detected by probe or amplified by primer. Thus for example, if the present
invention is applied to the diagnosis of β-thalassaemias, a sample may contain as many as 60,
for example 50, separate potential variant sequences. Each variant sequence is a potential
target sequence for probe hybridisation or primer amplification according to the invention
disclosed herein.

Amplification of the target sequence can be effected by primer extension off one
primer; however, in a preferred embodiment, each targeting oligonucleotide primer is
accompanied by a companion primer which facilitates amplification of the target sequence
interposed between the two primers according to amplification procedures such as polymerase
chain reaction (PCR) or ligase chain reaction (LCR), well known in the art.

When the targeting polynucleotides are for use as allele-specific probes for detecting
the presence of a target sequence, the spacer moiety interposed between the two hybridisable
elements can serve to prevent steric interference of the two hybrid elements, these two
elements being the hybrid duplex molecule consisting of the oligonucleotide targeting portion
(the 3' portion) and its complementary sequence from the test sample, and the hybrid duplex
consisting of the capture oligonucleotide on the array and the complementary sequence on the
targeting oligonucleotide (the 5' portion). The spacer moiety might consist of straight chain
or branched alkyl groups, polyglycol residues of any desired number of repeating units, or
modified nucleotides such as 2'-deoxyribose or 1'-naphthalene-2'-deoxyribose,
interposed between the first and second portions of the targeting oligonucleotides so as to
provide spatial distance between the capture hybrid and the target hybrid.

The capture hybrid refers to the duplex molecule formed by annealing the capture
oligonucleotide to the 5' portion (the single stranded "tail") of the targeting polynucleotide.
This targeting polynucleotide may or may not already have bound its target sequence. The
target hybrid refers to the duplex molecule formed by annealing of the target binding portion
(the 3' end) of the targeting polynucleotide to its target sequence. Although the target and
capture hybrids have been referred to as duplex molecules, it will be apparent that each
complex may not be entirely double stranded.

An advantage of the invention is that the target product, either probe-target hybrid or 
amplified product, has a single-stranded portion which can be hybridised without denaturation 
to the solid support containing the immobilised pre-selected capture oligonucleotide 
sequences. Current array technology requires the target nucleic acids to be denatured or 
rendered single-stranded some other way, prior to capture on the array.

The microarray and targeting polynucleotides of the invention are useful in any setting 
where it is desirable to identify the presence of one or more specific nucleic acid sequences 
from a population of sequences. Examples of uses are in de novo or re-sequencing methods, 
gene expression studies, fingerprinting, diagnostic identification, genotyping of organisms and 
environmental monitoring.

Any population of nucleic acids represents a suitable test sample. Sources of test
sample nucleic acid include human cells such as circulating blood, buccal epithelial cells, 
cultured cells and tumour cells. Other mammalian tissues and cultured cells are also suitable 
sources of template nucleic acids. In addition, viruses, bacteriophage, bacteria, fungi and 
other micro-organisms can be the source of nucleic acid for analysis. The DNA may be
genomic or it may be cloned in plasmids, bacteriophage, bacterial artificial chromosomes
(BACs), yeast artificial chromosomes (YACs) or other vectors. RNA may be isolated directly
from the relevant cells or it may be produced by in vitro priming from a suitable RNA
promoter or by in vitro transcription.

The present invention may be used for the detection of variation in genomic DNA
whether human, animal or other. It finds particular use in the analysis of inherited or acquired
diseases or disorders. A particular use is in the detection of inherited disease. It will be 
appreciated that the target nucleic acid is directly or indirectly linked to the sequence or region 
of interest for analysis. 

According to a further aspect of the invention there is provided a method for
identifying the presence or absence of one or more test nucleic acid sequences in a sample,
comprising the following steps:

i) contacting a nucleic acid containing sample with a plurality of single stranded
targeting polynucleotide molecules under suitable hybridisation conditions to ensure hybrid
formation between the targeting nucleotide portion of the targeting polynucleotide molecule
and its complementary target nucleic acid sequence in the sample, each of said targeting
polynucleotide molecules possessing, in addition to the targeting nucleotide portion, a unique
single-stranded oligonucleotide tail sequence complementary to a unique capture
oligonucleotide attached to a solid support;

ii) contacting the population of hybrid molecules produced in step (i) to a solid support
having attached thereon at pre-defined locations unique capture oligonucleotides, each capture
oligonucleotide being complementary to one or other of the oligonucleotide tail sequences on
the targeting molecules, under suitable conditions to ensure capture of each of the hybrid
products to the solid support; and

iii) determining the presence or absence of the captured product at each of the pre-defined
locations on the solid support by measurement of a detectable label suitable to
identify the captured products; characterised in that substantially all of the oligonucleotide
tails of the targeting polynucleotide molecules and the complementary sequences on the
capture oligonucleotides possess substantially the same Tm.

According to a further aspect of the invention there is provided a method for
identifying the presence or absence of one or more test nucleic acid sequences in a sample,
comprising:

i) contacting a nucleic acid containing sample with a plurality of single stranded targeting
polynucleotide molecules under suitable hybridisation conditions to ensure hybrid formation
between the targeting nucleotide portion of the targeting polynucleotide molecule and its
complementary target nucleic acid sequence in the sample, each of said targeting
polynucleotide molecules possessing, in addition to the targeting nucleotide portion, a unique
single-stranded oligonucleotide tail sequence complementary to a unique capture
oligonucleotide sequence attached to a solid support, characterised in that, for substantially all of
the oligonucleotides, the capture sequence of each oligonucleotide and its complementary
sequence on the tail possess substantially the same Tm;

ii) contacting the population of hybrid molecules produced in step (i) to a solid support having
attached thereon at pre-defined locations unique capture oligonucleotides, each capture
oligonucleotide being complementary to one or other of the oligonucleotide tail sequences on
the targeting molecules, under suitable conditions to ensure capture of each of the hybrid
products to the solid support; and

iii) determining the presence or absence of the captured product at each of the pre-defined
locations on the solid support by measurement of a detectable label suitable to identify the
captured products.
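
The logic of steps (i) to (iii) can be mimicked in software as a lookup from tail sequence to array address. The Python sketch below is a deliberate simplification: hybridisation is modelled as exact complementarity, the label is assumed to travel with any probe that has found its target, and the class and function names are hypothetical.

```python
from dataclasses import dataclass

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


@dataclass(frozen=True)
class TargetingPolynucleotide:
    name: str
    target_binding: str  # 3' portion, complementary to the target sequence
    tail: str            # 5' portion, complementary to one capture oligonucleotide


def detect_captured(array_layout: dict, probes: list, sample: list) -> dict:
    """Return {address: True/False} for every address on the array.

    array_layout maps an address, e.g. (row, column), to the capture
    oligonucleotide sequence attached there.
    """
    calls = {address: False for address in array_layout}
    address_of = {oligo: address for address, oligo in array_layout.items()}
    for probe in probes:
        # Step (i): the 3' portion finds its complementary site in the sample.
        site = reverse_complement(probe.target_binding)
        bound = any(site in strand.upper() for strand in sample)
        # Step (ii): the free tail is captured by its complementary oligonucleotide.
        address = address_of.get(reverse_complement(probe.tail))
        # Step (iii): a signal is scored only where a target-bound probe was captured.
        if bound and address is not None:
            calls[address] = True
    return calls
```

Because the array addresses depend only on the tails, the same layout can be reused with a different probe set simply by attaching the same tails to new 3' portions.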

In a preferred embodiment, the hybrid molecules produced in step (i) are separated
from the unhybridised targeting molecules prior to step (ii). This may be conveniently done
by, for example, ethanol precipitation, column chromatography or gel filtration.

In operation, the targeting polynucleotide binds to the target nucleic acid in the sample
("the wet reaction"). If the targeting polynucleotide is serving as a probe, for example as in an
allele specific hybridisation, the hybrid molecule is available for capture via the unhybridised
single stranded tail portion of the targeting polynucleotide which is complementary to a
capture oligonucleotide attached at a pre-defined position on the solid support.

In a preferred embodiment the targeting polynucleotide possesses a detectable label.

If the targeting polynucleotide is serving as an amplification primer, an amplification
reaction must be carried out on the polynucleotide-bound target nucleic acid sample. Multiple
rounds of primer extension from the primer can be effected in order to amplify the target
nucleic acid, such as up to 5, up to 10, up to 15, up to 20, up to 30, up to 40, up to 50 or more
times. Conveniently, the targeting polynucleotide serves as an amplification primer in an
amplification system such as the polymerase chain reaction (PCR), in which case the target
binding region (3' end) and the tail region (5' end) are advantageously arranged such that the
tail region is non-amplifiable in the PCR amplification reaction but remains single stranded,
thus ensuring that the amplified product has at least one single stranded tail complementary to
a capture oligonucleotide on the solid support. This facet of primer design is described in European
Patent No. 0 416 817 and corresponding US Patent No. 5525494. In order to effect
amplification using PCR-based or LCR-based procedures, a second primer is required. This
second oligonucleotide primer may also have a non-amplifiable single stranded tail, possibly
identical to the capture portion of the targeting polynucleotide so as to facilitate capture on the
solid support. This second primer might also be suitably labelled to facilitate detection of the
captured amplification product on the solid support. Alternatively, a suitable label (such as a
labelled nucleoside triphosphate) might be incorporated into the amplified product during the
amplification process.

Any convenient template dependent polymerase may be used; this is preferably a
thermostable polymerase enzyme such as Taq™, more preferably Taq Gold™.

Similarly any convenient nucleoside triphosphates for conventional base pairing may
be used. If required these may be modified for fluorescence. As these may affect
polymerisation rates, for best results, the fluorescently labelled dNTPs are admixed with an
excess of wild-type dNTPs, for example in an admixture of between about 1:3 and 1:20.
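
As a worked illustration of the admixture, the Python sketch below splits a total dNTP concentration into labelled and unlabelled parts. It assumes that the ratio is to be read as labelled:unlabelled, so that 1:3 means one part labelled to three parts unlabelled; the function name and the micromolar unit are illustrative choices, not taken from the specification.

```python
def split_dntp(total_um: float, unlabelled_parts: float) -> tuple:
    """Split a total dNTP concentration (in micromolar) at a 1:N ratio.

    Returns (labelled, unlabelled). N is restricted to the 1:3 to 1:20
    range given in the text, read here as labelled:unlabelled.
    """
    if not 3 <= unlabelled_parts <= 20:
        raise ValueError("ratio outside the 1:3 to 1:20 range")
    labelled = total_um / (1 + unlabelled_parts)
    return labelled, total_um - labelled
```

For example, a total of 200 micromolar at 1:3 corresponds to 50 micromolar labelled and 150 micromolar unlabelled nucleotide, and 210 micromolar at 1:20 to 10 and 200 micromolar respectively.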

Further details of convenient polymerases, nucleoside triphosphates, other PCR 
reagents, primer design, instruments and consumables are given in "PCR" by C.R. Newton 
and A. Graham (The Introduction to Biotechniques series, Second Edition 1997, ISBN 1 
85996 011 1, Bios Scientific Publishers Limited, Oxford). Further guidance may be found in
"Laboratory protocols for mutation detection" edited by Ulf Landegren, published by the
Oxford University Press, Oxford, 1996, ISBN 0 19 857795 8.

In a preferred embodiment, the targeting polynucleotide molecules are amplification
primers, such as those disclosed in EP-B-416817. In a further preferred embodiment, the
targeting amplification primer molecules are used in conjunction with another amplification
primer to amplify the target region of interest. In a more preferred embodiment, the
amplification primers are amplification refractory mutation system (ARMS) primers, as
described in EP-B-0332435 and corresponding US Patent No. 5595890, and additionally
described in Newton et al. (Nucleic Acids Research. 17(7):2503-2516, 1989). ARMS is a
technique suitable for detecting the presence or absence of variant nucleotides at a particular
locus. It is particularly useful therefore, in diagnostic detection of mutated nucleic acid
indicative of tumour phenotype, or in the detection of single nucleotide polymorphisms
(SNPs). ARMS is a particularly useful technique where the target sequence is present in low
copy number or there is a need to discriminate between two or more alleles, as for example in
mutation detection. ARMS mutation detection enables the sensitive detection of specific
alleles in the presence of an excess of alternate alleles. In this way somatic mutations can be
detected in a background of wild type DNA. ARMS can readily be used to detect 1% mutant
sequence in a 99% wild-type background. ARMS uses primers that allow amplification in an
allele specific manner. Allele specificity is provided by the complementarity of the 3'-
terminal base of a primer with its respective allele. Amplification is inhibited when the 3'-
terminal base of the primer is mismatched. This specificity is maintained when Taq DNA
polymerase or other suitable enzyme lacking 3' to 5' proof-reading activity (such as Klenow)
is used. An ARMS test is specific when the yield of product from the target allele exceeds the
threshold of detection of the system in use and the yield of product from the nontarget allele is
not detectable. As disclosed in EP-B-0332435 the ARMS primers will preferably possess
destabilising mismatches incorporated close to the 3'-terminal nucleotide that discriminates
between the different alleles, to enhance specific binding and template amplification from the
desired allele target sequence. The nearer to the 3' terminus of the primer that a destabilising
mismatch is incorporated, the greater the effect on destabilisation (See also Newton et al.
Nucleic Acids Research. 17:2503-2516, 1989).
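
The primer design principle described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the design procedure of EP-B-0332435: the primers are written in the same sense as the supplied template strand, the 3'-terminal base carries the allele, and the base substituted at the optional destabilising position is chosen by an arbitrary fixed table (`SWAP`). All names are hypothetical.

```python
from typing import Dict, Optional, Sequence

# Arbitrary illustrative substitutions used to create a destabilising mismatch.
SWAP = {"A": "C", "C": "A", "G": "T", "T": "G"}


def arms_primers(template: str, variant_index: int, alleles: Sequence[str],
                 length: int = 20,
                 mismatch_offset: Optional[int] = None) -> Dict[str, str]:
    """Build one allele-specific primer per allele.

    Each primer ends (3' terminus) on the variant position and differs from
    the others only in that terminal base. `mismatch_offset` counts from the
    3' end (1 = terminal base), so 2 places the extra mismatch at the
    penultimate base.
    """
    start = variant_index - length + 1
    if start < 0:
        raise ValueError("not enough sequence upstream of the variant")
    body = template.upper()[start:variant_index]
    primers = {}
    for allele in alleles:
        bases = list(body + allele.upper())
        if mismatch_offset is not None:
            if not 2 <= mismatch_offset <= length:
                raise ValueError("mismatch_offset must lie between 2 and length")
            bases[-mismatch_offset] = SWAP[bases[-mismatch_offset]]
        primers[allele.upper()] = "".join(bases)
    return primers


def three_prime_matches(primer: str, allele_in_sample: str) -> bool:
    """True when the 3'-terminal base of the primer matches the sample allele."""
    return primer[-1].upper() == allele_in_sample.upper()
```

In a tailed ARMS primer each allele-specific sequence would be joined, through a blocking moiety, to its own 5' tail so that the products for the two alleles are captured at different addresses on the array.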



In another preferred embodiment, the second amplification primer for amplifying any
particular target sequence possesses an oligonucleotide tail identical to that of the first targeting
primer. In a further preferred embodiment, the second amplification primer possesses a
detectable label, such as a fluorophor or radioisotope to enable the eventual detection of the
amplified product on the microarray. In a further preferred embodiment either or both primer 
molecules possess a detectable label. Suitable labelling molecules are well known in the art. 
Alternatively, a suitable label can be incorporated into the amplified product during its 
synthesis. A suitable amplification reaction can then be performed so as to generate an
amplification product. 

It will be apparent to the person skilled in the art that there are a large number of
analytical procedures which may be used to detect the presence or absence of variant
nucleotides at one or more polymorphic positions. Most of these rely on probe or primer
hybridisations and thus, with addition of a suitable tail portion to enable capture on an array,
could be adopted for use with the microarray and method of the present invention. In general,
the detection of allelic variation requires a mutation discrimination technique, optionally an
amplification reaction and a signal generation system. Table 1 lists a number of mutation
detection techniques, some based on the polymerase chain reaction (PCR). These may be
used in combination with a number of signal generation systems, a selection of which is listed
in Table 2. Many current methods for the detection of allelic variation are reviewed by Nollau
et al., Clin. Chem. 43, 1114-1120, 1997; and in standard textbooks, for example "Laboratory
Protocols for Mutation Detection", Ed. by U. Landegren, Oxford University Press, 1996 and
"PCR", 2nd Edition by Newton & Graham, BIOS Scientific Publishers Limited, 1997. PCR is
described in United States patents nos. 4,683,195 and 4,683,202.



Abbreviations:

ALEX™    Amplification refractory mutation system linear extension
APEX     Arrayed primer extension
ARMS™    Amplification refractory mutation system
b-DNA    Branched DNA
CMC      Chemical mismatch cleavage
COPS     Competitive oligonucleotide priming system
FRET     Fluorescence resonance energy transfer
LCR      Ligase chain reaction
MASDA    Multiple allele specific diagnostic assay
OLA      Oligonucleotide ligation assay
PCR      Polymerase chain reaction
SDA      Strand displacement amplification
SSR      Self sustained replication

Table 1 - Mutation Detection Techniques

MASDA; Taqman™ - US-5210015 & US-5487972; Molecular Beacons - Tyagi et al. Nature
Biotechnology. 14:303, (1996) and WO 95/13399; ARMS™; ALEX™ - European Patent No.
EP 332435 B1; COPS - Gibbs et al. Nucleic Acids Research. 17:2347, 1989; APEX; OLA;
SSR; NASBA; LCR; SDA; b-DNA; and minisequencing - Pastinen et al. Genome Research.
7:606-614, 1997.



Table 2 - Signal Generation or Detection Systems

Fluorescence: Fluorescence intensity, FRET, Fluorescence quenching, Fluorescence
polarisation - United Kingdom Patent No. 2228998.

Other: Chemiluminescence, Electrochemiluminescence, Raman, Radioactivity, Colorimetric,
Hybridisation protection assay, Mass spectrometry.

When the targeting polynucleotides are operating as allele-specific probes, the target
nucleic acid is preferably labelled in order to be able to discriminate targeting
polynucleotides bound to the capture oligonucleotides on the array that have target nucleic
acid attached from those that do not. According to one way to do this, the nucleic acids are
degraded to form fragments; degradation is preferably random, using for example sonication or
shearing, to generate average lengths of target nucleic acid around the lengths of the
complementary sequences on the targeting oligonucleotides. These fragments can then be
labelled. Any number of conventional detectable markers such as radioisotopes, fluorescent
labels, chemiluminescent compounds, labelled binding proteins, magnetic labels,
spectroscopic markers and linked enzymes might be used. One particular example well
known in the art is end-labelling with 32P. Fluorescent labels are preferred because they are
less hazardous than radiolabels, they provide a strong signal with low background and various
different fluorophors capable of absorbing light at different wavelengths and/or giving off
different colour signals exist to enable comparative analysis in the same analysis. For
example, fluorescein gives off a green colour, rhodamine gives off a red colour and both
together give off a yellow colour. If the target bound targeting oligonucleotides are not
separated from unbound targeting oligonucleotides (following step (i) of the method disclosed
herein), and the target nucleic acid is not specifically labelled, other means of discriminating
between those captured targeting oligonucleotides that have bound their cognate target
sequence from the test sample and those that have not will be
required. Suitable means for doing this include the use of intercalating agents (i.e. dyes such
as ethidium bromide) that become incorporated into duplex nucleic acid or the use of labelled
binding proteins or antibodies or other reagents that recognise helix formation (such as the
target nucleic acid/targeting oligonucleotide hybrid), see for example US Patent No.
4,582,789, or the use of a ligand binding to the minor groove such as Hoechst 33258
fluorescent dye or the use of fluorescently labelled ligands which recognise the minor groove
of DNA in a sequence specific manner, see for example, "Recognition of the Four Watson-
Crick Base Pairs in the DNA Minor Groove by Synthetic Ligands." S. White, J. W.
Szewczyk, J. M. Turner, E. E. Baird and P. B. Dervan, Nature, 391, 468 (1998). Convenient
intercalators will be apparent to the person skilled in the art (cf. Higuchi et al. BioTechnology.
10:413-417, 1992). In a preferred embodiment of the invention, capture of the hybrid
detection product by the oligonucleotide on the solid support is detected using one or more
minor groove binding probes.

It will be apparent to the person skilled in the art that there are other conventional
detection means that can be employed in order to detect target bound polynucleotides captured
on the solid support. The essential feature is that hybridisation of the captured target nucleic
acid bound targeting polynucleotide onto the capture oligonucleotide on the microarray causes
a detectable change in a signalling system. Any convenient signalling system may be used;
by way of non-limiting example we refer to the measurement of the change in fluorescence
polarisation of a fluorescently labelled species (European Patent No. 0 382 433), DNA
binding proteins, intercalators, or the incorporation of detectable (modified) dNTPs into the
primer extension products or the target nucleic acids.

Further systems include two-component systems where a signal is created or abolished
when the two components are brought into close proximity with one another. Alternatively, a
signal is created or abolished when the two components are separated following binding of the
target binding region.

Both elements of the two component system may be provided on the same or different
molecules. By way of example, where the elements are placed on different molecules, target specific
binding displaces one of the molecules into solution leading to a detectable signal. One of the
components may be attached to the capture oligonucleotide, or the solid support itself. For
example, the array could consist of fluorescein labelled oligonucleotides of, for example, 20
residues in length. Prior to addition of the sample, a set of short quencher oligonucleotides
(say 10 residues in length) complementary to the array and labelled with DABCYL (or methyl
red) could be added to the array. The short complementary DABCYL oligonucleotides bind
to the corresponding 'address' on the array and the fluorescence of the fluorescein labels is
quenched. The sample is then purified so that unextended primers or unbound probes are
separated from extended or bound products. The bound or amplified products are then added
to the microarray and the tail portions which are fully complementary to the oligonucleotides
on the microarray bind to the microarray with displacement of the quencher oligonucleotides.
This results in the microarray oligonucleotides fluorescing as a result of the binding of the
appropriate product (see Figure 1). In this format a fluorescent signal is produced by
separating two species bound to the array surface. One advantage of this format is that it
permits quality control of the array. When the fluorescent array is manufactured it can be
scanned in a fluorescent scanner and any defects such as a probe oligonucleotide which has
failed to attach to the surface will be detected as a non or weakly fluorescent spot on the array.
Efficient quenching when the quencher oligonucleotides are added can also be monitored
before the test products are added.

Convenient two-component systems may be based on the use of energy transfer, for 
example between a fluorophore and a quencher. In a particular aspect of the invention the 
detection system comprises a fluorophore/quencher pair. Convenient and preferred 
attachment points for energy transfer partners may be determined by routine experimentation. 
A number of convenient fluorophore/quencher pairs are detailed in the literature (for example
Glazer et al., Current Opinion in Biotechnology. 8:94-102, 1997) and in catalogues such as






WO 00/47767 





PCT/GB00/00357 




those from Molecular Probes, Glen and Applied Biosystems (ABI). Any fluorescent molecule
is suitable for signalling provided it may be detected on the instrumentation available. Most
preferred are those compatible with the 488 nm line of the Argon ion laser (Fluorescein and
Rhodamine derivatives). The quencher must be able to quench the dye in question and this
may be via a Fluorescence Resonance Energy Transfer (FRET) mechanism involving a
second, receptor fluorophore, or more preferably via a collisional mechanism involving a non-
fluorogenic quencher such as DABCYL, which is a "Universal" quencher of fluorescence, or
methyl red. Furthermore it is preferred that the selected fluorophores and quenchers are
incorporated, most conveniently via phosphoramidite chemistry, into the capture
oligonucleotides and/or targeting polynucleotides and/or second primer required, for example,
when undertaking PCR-based amplification. FAM, a fluorescein dye with an excitation
optimum at ~490 nm, is a convenient donor.



In another embodiment of the invention the oligonucleotides on the support ("array")
are detectably labelled.

A further embodiment consists of a fluorescently labelled DNA array where the array
oligonucleotides are labelled with a fluorophore which either does not fluoresce at the
irradiation frequency or is only weakly fluorescent at this frequency. The ARMS primers are
labelled with a fluorophore which is substantially fluorescent at the irradiation frequency and
which forms an energy transfer pair with the fluorophore label on the DNA array. When the
purified ARMS products are bound to the array and subjected to irradiation there is energy
transfer between the fluorophore on the ARMS product and that on the array. The array
fluorophore at the specific binding sites increases substantially in its fluorescent brightness and
this may be detected by scanning the array. This is an example of increasing the fluorescent
signal on the array by bringing two species close together by hybridisation.

The oligonucleotide microarrays and targeting polynucleotides of the invention can be
used for large scale hybridisation assays in numerous applications, including genetic and
physical mapping of genomes, gene expression studies, sequencing, fingerprinting and
genotype mapping, genetic diagnosis and environmental monitoring. The microarray,
targeting polynucleotides and method of the invention are particularly suitable for differential
gene expression studies. When utilising the generic array and method of the present invention
for assessing expression levels of certain genes, RNA can be isolated from a cell or cell
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population and labelled, for example by attaching a fluorescent molecule to isolated RNA or 
by end labelling using T4 polynucleotide kinase. Alternatively, mRNA can be reverse
transcribed into cDNA and a suitable label incorporated during cDNA synthesis.
Fragmentation of these labelled molecules can then precede hybridisation to the targeting
polynucleotides prior to capture on the generic array of the invention. As with conventional
differential expression studies using gene chips, different fluorescent labels (for example
Cy3 (green) or Cy5 (red)-labelled deoxyuridine triphosphate) can be used on different test
samples (i.e. induced versus un-induced) to enable direct comparison of gene expression
levels in the two samples on the same array. The relative fluorescence intensity of each fluor
at each array element (capture location) provides a measurement of the relative abundance of
the respective RNA in the two cell populations.
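
The two-colour comparison described above can be sketched as a log ratio of the two fluor intensities read at one capture location. This is a minimal illustration only: the function name, the optional background arguments and the choice of a base-2 logarithm are assumptions made for the example and are not taken from the specification.

```python
import math


def log2_ratio(cy3, cy5, background_cy3=0.0, background_cy5=0.0):
    """Return log2(Cy3/Cy5) for one capture location.

    Hypothetical helper: intensities are background-corrected first, and a
    non-positive corrected value is rejected rather than silently clipped.
    """
    corrected_cy3 = cy3 - background_cy3
    corrected_cy5 = cy5 - background_cy5
    if corrected_cy3 <= 0 or corrected_cy5 <= 0:
        raise ValueError("corrected intensities must be positive")
    return math.log2(corrected_cy3 / corrected_cy5)


# A positive value means more Cy3-labelled than Cy5-labelled material was captured.
ratio = log2_ratio(250.0, 150.0, background_cy3=50.0, background_cy5=50.0)
```

A value of zero would indicate equal abundance in the two samples at that location, under the assumption that both labels are detected with equal efficiency.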

With current differential expression studies utilising oligonucleotide arrays, because of
the differences in hybridisation properties between short oligonucleotide probes, each target
gene must be represented by several oligonucleotides (typically 20 or more) on the chip. In
addition, a partner control oligonucleotide identical to each oligonucleotide, except for one of
the central nucleotides, is included on the array to serve as an internal control for
hybridisation sensitivity. Thus, whereas cDNA arrays only require each gene to be
represented by a single hybridisation partner on the array, with the oligonucleotide arrays,
each test gene must be represented by at least 40 distinct oligonucleotides each at a different
position on the array.

The use of many, for example 5-80, preferably 10-20, 15-25, 20-30, 25-35, 35-50,
distinct targeting polynucleotides per target gene (each range being a separate and
independent embodiment of the invention), each complementary to different regions of a
particular target gene, and each having the identical tail sequence for capture by a unique
capture oligonucleotide on the array, obviates the need for having many distinct
oligonucleotides at different locations on the array. Moreover, because each of the capture
oligonucleotides possesses the same or substantially the same Tm, approximately equivalent
capture onto the solid support is expected. This means that the generic array and method of
the invention should be particularly suitable for differential expression studies where
quantitative analyses are desired. Quantitative analyses in current differential expression
studies are limited because of the different hybridisation stabilities of each capture:target
duplex sequence. Some sequences will bind more efficiently under a given set of
hybridisation conditions than others, hampering precise quantitative analyses. This situation
does not arise with the microarray of the present invention because all the capture
oligonucleotides on the array possess the same or substantially the same Tm.

Thus according to a further aspect of the invention there is provided the use of the
generic microarray, targeting polynucleotides and method of the invention in determining the
expression levels of gene(s) from a sample.

A suitable sample might be a tissue sample, for example a biopsy or a sample of bodily fluid
such as blood, or a cell sample, for example epithelial or buccal cells, or a cultured cell or
cell line such as a mammalian cell or cell line, or a bacterial or yeast cell. Alternatively, it
may be from a whole organism, such as Arabidopsis thaliana. Any sample containing one or
a plurality of different genes is suitable.

The use of numerous distinct targeting polynucleotides, each capable of binding to a
different region of a particular gene, so as to overcome the problem identified above of the
differing hybridisation properties of short oligonucleotide probes, need not be restricted to
use with arrays comprising oligonucleotides that have substantially the same Tm, although
such arrays are preferred.

Thus, according to another aspect there is provided a method for quantifying the
expression level of a gene comprising:

(i) converting mRNA from a test sample into cDNA;

(ii) optionally, fragmenting said newly synthesised cDNA into appropriate length
nucleic acid fragments;

(iii) contacting the cDNA with a plurality of targeting polynucleotide molecules under
suitable conditions to allow hybridisation between substantially complementary sequences to
occur, each polynucleotide molecule comprising a unique 3' portion substantially
complementary to a unique region of the cDNA, a 5' tail portion complementary to one of a
group of pre-selected oligonucleotides that are attached at pre-defined positions to a solid
support, and optionally a spacer moiety interposed between said 3' and 5' portions;

(iv) contacting the components from step (iii) with a substrate on which is
immobilised at pre-determined positions a plurality of capture oligonucleotide sequences each
complementary to one or other of the 5' tail portions of the targeting polynucleotides so as to
allow the tailed cDNA/targeting polynucleotide duplex molecules to bind to their
complementary capture oligonucleotide on the substrate; and

(v) detecting the amount of bound cDNA/targeting polynucleotide duplex at each
position on the substrate.
According to a preferred embodiment of this particular aspect, unhybridised targeting
polynucleotides are removed from the reaction mixture after step (iii). In another preferred
embodiment, the sub-set of targeting polynucleotides directed to a specific gene all possess
the same 5' tail portion sequence so that all targeting polynucleotide/cDNA duplex molecules
formed can be captured at the same location (by the same oligonucleotide) on the support
(array). In another embodiment, the method is used to determine expression levels of various
genes in a test sample, each gene capable of being detected by a different sub-set of targeting
polynucleotides and addressed to distinct positions on the support. In another embodiment,
the cDNA generated in step (i) is detectably labelled. In another embodiment the gene or each
gene to be detected, as represented by cDNA molecules produced in step (i), is detected by
between 5 and 80 distinct polynucleotides that bind at distinct parts of the gene.
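
The addressing scheme in the embodiments above, in which every targeting polynucleotide in a gene's sub-set carries a tail that sends it to that gene's capture position, can be sketched as a small lookup table. The tuple layout, the function name and the gene and position labels are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_address_map(targeting_polynucleotides):
    """Map each capture position to the single gene addressed there.

    `targeting_polynucleotides` is an iterable of
    (gene, target_region, capture_position) tuples (an assumed layout).
    A ValueError is raised if two different genes would share one position.
    """
    address = {}
    probes_per_gene = defaultdict(int)
    for gene, _region, position in targeting_polynucleotides:
        # setdefault records the first gene seen at a position and returns it afterwards.
        if address.setdefault(position, gene) != gene:
            raise ValueError(
                f"position {position} is shared by {address[position]} and {gene}"
            )
        probes_per_gene[gene] += 1
    return address, dict(probes_per_gene)


probes = [
    ("geneX", "200-230", "A"),
    ("geneX", "350-380", "A"),  # same tail, so same capture position
    ("geneY", "10-40", "B"),
]
address, counts = build_address_map(probes)
```

The check deliberately allows one gene to occupy several positions, since the specification only requires that different genes be addressed to distinct positions.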

According to a further aspect of the invention there is provided a method for
identifying the differential expression of each of a plurality of genes in a first cell type with
respect to expression of the same genes in a second cell type, said method comprising:

(i) isolating mRNA from each cell type and converting said mRNA into cDNA with
incorporation of a different fluorescent label into the newly synthesised cDNA for each cell
type;

(ii) optionally, fragmenting said newly synthesised cDNA into appropriate length
nucleic acid fragments;

(iii) contacting said labelled nucleic acid with a plurality of targeting polynucleotide
molecules under suitable conditions to enable hybridisation between substantially
complementary sequences to occur, each polynucleotide molecule comprising a unique 3'
portion substantially complementary to a unique target nucleic acid sequence which may be
present in a sample, a 5' tail portion complementary to one of a group of pre-selected
oligonucleotides that each possess substantially the same melting temperature (Tm) and are
attached at pre-defined positions to a solid support, and optionally a spacer moiety interposed
between said 3' portion and said 5' portion;

(iv) detectably hybridising the hybridised products from step (iii) to a solid surface on
which is immobilised at pre-determined positions a plurality of oligonucleotide sequences
complementary to one or other of the 5' tail portions of the targeting polynucleotides; and

(v) examining the solid support by fluorescence under fluorescence excitation
conditions to detect the bound nucleic acid from each cell type, whereby the
amount of labelled nucleic acid from each cell type at each particular location on
the solid surface can be detected on the basis of the different fluorescence emission
colour produced by the different labels incorporated.

According to a further aspect of the invention there is provided a method for
identifying the differential expression of each of a plurality of genes in a first cell type with
respect to expression of the same genes in a second cell type, said method comprising:

(i) isolating mRNA from each cell type and converting said mRNA into cDNA with
incorporation of a different fluorescent label into the newly synthesised cDNA for each
cell type;

(ii) optionally, fragmenting said newly synthesised cDNA into appropriate length nucleic
acid fragments;

(iii) contacting said labelled nucleic acid with a plurality of targeting polynucleotide
molecules under suitable conditions to effect hybridisation between substantially
complementary sequences, each polynucleotide molecule comprising a unique 3'
portion substantially complementary to a unique target nucleic acid sequence which may
be present in a sample, a 5' tail portion complementary to one of a group of pre-selected
oligonucleotides that each possess substantially the same melting temperature (Tm) and
are attached at pre-defined positions to a solid support, and optionally a spacer moiety
interposed between said 3' and 5' portions;

(iv) contacting the hybridised products from step (iii) to a solid surface on which is
immobilised at pre-determined positions a plurality of oligonucleotide sequences
complementary to one or other of the 5' tail portions of the targeting polynucleotides;
and,

(v) examining the solid support by fluorescence under fluorescence excitation
conditions to detect the bound nucleic acid from each cell type, whereby the
amount of labelled nucleic acid from each cell type at each particular location on
the solid surface can be detected on the basis of the different fluorescence emission
colour produced by the different labels incorporated.

In a preferred embodiment for differential expression studies, each target gene is
detected by a plurality of targeting polynucleotides that bind at distinct parts of the target
gene. In a more preferred embodiment, there are between 5 and 80, preferably about 10-20,
15-25, 20-30, 25-35 or 35-50, distinct targeting oligonucleotides per target gene, each range
being a separate and independent embodiment of the invention. In another preferred
embodiment, each of the targeting polynucleotides for a particular gene possesses an identical
single stranded oligonucleotide tail portion capable of capturing the target:polynucleotide
hybrid molecule onto a solid support at a pre-defined position. In another embodiment of the
invention the identical gene detection products from each cell line are captured by different
oligonucleotides on the array. For example, sequence location 200-230 of gene X isolated
from a normal tissue sample is captured at position A whereas sequence location 200-230 of
gene X isolated from tumour tissue is captured at position B. Then sequence location 350-380
of gene X isolated from the normal tissue sample can either also be captured at location A (to
pool all the results for one gene from one cell type at one site) or at its own unique location C.
Similarly, sequence location 350-380 of gene X isolated from tumour tissue can either be
captured at position B or position D. Although not essential, it is preferred that the nucleotide
sequence of each of the capture portions of the targeting polynucleotides and complementary
sequences on the capture oligonucleotides on the solid support have substantially the same
Tm.

In order to facilitate more accurate quantitative analysis, hybridisation conditions are
adopted to ensure maximum annealing of the targeting polynucleotides to their target
sequences. This may be effected by pooling together those targeting polynucleotides whose
target binding sequence possesses on binding to its target approximately the same Tm as the
others in the pool. More preferably, the target binding portion of each targeting
polynucleotide in a particular pool is of the same nucleotide length and has the same G-C
content as the others in the pool. According to a preferred embodiment therefore, each pool of
targeting polynucleotides is hybridised to a portion of the test nucleic acid sample under
optimum conditions for hybrid formation with their respective target sequences. The
hybridisation reactions from each pool of targeting polynucleotides are then pooled together.
Following this, capture of the target bound sequences onto the solid support is effected using
hybridisation conditions adapted according to the particular Tm of the capture duplexes.

For whatever application, hybridisation conditions chosen are designed to be as close
as possible to the Tm of the duplexes. The concentration of salt in the hybridisation solution
used is particularly significant. At 1M NaCl, G:C base pairs are more stable than A:T base
pairs. Similarly, double stranded oligonucleotides with a higher G-C content have a higher
Tm than those of the same length but with a higher A-T content. If slight differences, i.e.
single nucleotide differences, amongst the target nucleic acids need to be distinguished,
establishing optimum hybridisation conditions is important, particularly when the
hybridisable length of the oligonucleotides is small (< approximately 30-mers). Where,
because of the diverse composition of the target sequences, there is a broad range of Tm, either
a less than optimum compromise set of hybridisation conditions could be adopted, or
conditions could be manipulated so as to diminish the Tm dependence on nucleotide
composition by using chaotropic hybridisation solutions. This can be effected, for example,
by incorporation into the hybridisation solution of a tertiary or quaternary amine.
Tetramethylammonium chloride (TMACl) is particularly suitable when used at concentrations
of between 2M and 5.5M, a preferred concentration range being 3M - 4M. Compared to the
presence of 1M NaCl in the hybridisation solution, use of up to 5M TMACl can enhance
hybridisation specificity by up to 40-fold. A preferred means of ensuring maximum hybrid
formation despite there being a range of Tm due to target sequence composition, is to divide
the population of targeting polynucleotides into groups according to their optimum Tm, and
then to undertake separate hybridisation reactions using sub-groups of pooled targeting
polynucleotides that are grouped according to the Tm of the targeting polynucleotide:target
sequence hybrid portion. Each hybridisation can then be carried out under optimum
hybridisation conditions for the particular group of targeting polynucleotides. In this manner,
optimum hybridisation conditions can be adopted which will ensure approximately equivalent
duplex formation. This may be of particular importance if quantitative analysis is required.
The products from the different hybridisation reactions can then be pooled ready for capture
hybridisation to the solid support (Reaction B, see below). An example of a suitable
hybridisation solution involving oligonucleotides of between 15 and 50 nucleotides is: 3M
TMACl, 0.01M sodium phosphate (pH 6.8), 1mM EDTA (pH 7.6), 0.5% SDS, 100 µg/ml
denatured, fragmented salmon sperm DNA and 0.1% dried skimmed milk (i.e. Marvel™).
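
The grouping of targeting polynucleotides by Tm described above can be sketched as follows. The specification does not say how Tm is to be estimated; the sketch assumes the simple Wallace rule (2°C per A or T, 4°C per G or C), which is a rough approximation for short oligonucleotides in sodium-based buffers and does not apply in TMACl solutions. The function names and the bin width are likewise assumptions made for the example.

```python
def wallace_tm(sequence):
    """Rough Tm estimate in degrees C using the Wallace rule (an assumed choice)."""
    seq = sequence.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence may contain only A, C, G and T")
    return 2 * at + 4 * gc


def pool_by_tm(target_binding_sequences, bin_width=4):
    """Group sequences into pools whose estimated Tm falls in the same bin.

    Keys are the lower bound of each Tm bin; each pool would then be
    hybridised separately under conditions suited to that bin.
    """
    pools = {}
    for seq in target_binding_sequences:
        lower_bound = wallace_tm(seq) // bin_width * bin_width
        pools.setdefault(lower_bound, []).append(seq)
    return pools


pools = pool_by_tm(["AAAA", "TTTT", "GGGG"])
```

A nearest-neighbour thermodynamic model would give better estimates than the Wallace rule; it is used here only because it keeps the pooling logic easy to follow.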

Use of the generic microarray and targeting polynucleotides of the invention requires
two hybridisation reactions to be undertaken. One (A) involves hybrid formation between the
targeting polynucleotides and the target nucleic acid in the test sample ("the wet reaction").
The other (B) involves hybrid formation between the targeting polynucleotides and the
capture oligonucleotides bound to the solid support. Depending on the type of study
undertaken, the hybridisations can be carried out in either order or together, although it is
preferred that the wet reaction A is carried out prior to reaction B.

With differential expression studies it is preferred that hybridisation A be carried out
under conditions that ensure maximal hybridisation of the targeting polynucleotides to target
sequence. In general, expression monitoring experiments require long overnight
hybridisations with low stringencies (higher salt concentrations, and lower temperatures) in
order to allow hybrid formation between the target nucleic acids and probes that have different
stabilities. This also enhances annealing of low copy number sequences. This "wet reaction"
can therefore be carried out first in order to allow maximum annealing. Capture of the hybrid
molecules from the "wet reaction" can then be effected using greater hybridisation
stringencies (at lower salt concentrations and higher temperatures) over shorter time periods
(i.e. 1-3 hours). The optimum hybridisation conditions can, however, be determined from the
expected Tm of the capture duplex molecules.

This invention is concerned with a novel microarray of universal use comprising a
plurality of oligonucleotides, each possessing substantially the same Tm, attached at
pre-defined positions to a solid support. The design and construction of solid supports is well
known in the art. Essentially, any conceivable solid substrate may be employed in the
invention. A suitable substrate is a material having a rigid or semi-rigid surface, generally
insoluble in a solvent of interest such as water. Specific suitable substrates are glass, plastics,
polymers, polysaccharides, resins, metal, silica or silica-based material, nylon or
nitrocellulose filters, and the like. The solid support may comprise a single sheet of a suitable
material such as glass, silicon or plastic so that the pre-selected oligonucleotides are
positioned at pre-defined sites based on each oligonucleotide having a distinct and
distinguishable set of x,y co-ordinates. Alternatively the solid support may comprise a set of
beads of a suitable material such as glass or plastic so that the pre-selected oligonucleotides
are positioned at pre-defined sites based on each oligonucleotide residing on a distinct bead.
In a preferred embodiment the substrate and/or its surface will be flat glass or single-crystal
silicon. Suitable examples of existing laboratory materials that can be utilised are glass
microscope slides and microtitre (such as 96-well) plates. The surface of the substrate will
preferably contain reactive groups such as carboxyl, amino, hydroxyl, or the like. A
polycationic polymer such as polylysine is particularly useful. Most preferably, with
fluorescence detection, the surface is non-fluorescent at the wavelength that the analysis is to
be performed. The surface of the substrate is also preferably provided with a layer of
cross-linking groups to assist attachment of the oligonucleotides to the support. These cross-linking
groups will preferably be of sufficient length to permit the oligonucleotides attached to
interact freely with their binding partners in solution. Cross-linking groups may be selected
from any suitable class of compounds, for example, aryl acetylenes, ethylene glycol oligomers
containing 2-1 monomer units, diamines, diacids, amino acids, and the like. The cross-linking
groups may be attached by a variety of methods which are readily apparent to the person
skilled in the art, for example by esterification or amidation reactions of an activated ester of
the linking group with a reactive hydroxyl or amine group on the free end of the cross-linking
group.

The detection of specific interactions may be performed by detecting the positions
where the labelled target sequences are attached to the array. Radiolabeled probes can be
detected using conventional autoradiography techniques. Use of scanning autoradiography
with a digitised scanner and suitable software for analysing the results is preferred. Where the
label is a fluorescent label, the apparatus described, e.g. in International Publication No. WO
90/15070, US Patent No. 5,143,854 or US Patent No. 5,744,305 may be advantageously
applied. Indeed, most array formats use fluorescent readouts to detect labelled capture:target
duplex formation. Laser confocal fluorescence microscopy is another technique routinely in
use (M.J. Kozal et al., Nature Medicine. 2:753-759, 1996). Mass spectrometry may also be
used to detect oligonucleotides bound to a DNA array (Little DP et al., Analytical Chemistry.
69(22):4540-4546, 1997). Whatever the reporter system used, sophisticated gadgetry and
software may be required in order to interpret large numbers of readouts into meaningful data
(such as described, for example, in US Patent No. 5,800,992 or International Publication No.
WO 90/04652).

Once a particular sequence has, or group of sequences have, been hybridised to the
microarray and the pattern of hybridisation analysed, the microarray can be treated to remove
the bound sequences in preparation for reuse of the microarray by exposure to a second or
subsequent set of target sequences. In order to do this the hybrid duplexes are disrupted and
the solid support matrix treated in order to remove all traces of the original target. To effect
this, the matrix may be treated with various detergents or solvents to which the substrate, the
oligonucleotides and the linkages to the substrate are inert. This treatment may involve an
elevated temperature treatment, treatment with organic or inorganic solvents, modifications in
pH, and other means for disrupting specific interactions. Examples of methods that could be
used are: (1) Washing the array with 50 mM sodium hydroxide to disrupt base pairing by high
pH. (2) Washing the array with pure water and at high temperature (e.g. > 80°C) to disrupt
base pairing by high stringency. (3) Addition of oligonucleotide sequences complementary to
the tail sequences (and identical to the chip sequences) to disrupt base pairing by exchange
with the sequences in free solution. Other methods for disrupting duplex formation are well
known in the art (see for example Sambrook et al. ibid). Because the microarray of the
invention is not a custom chip, but rather a generic chip which interacts with specific custom
targeting oligonucleotides, once the microarray has been cleaned, it can be reused in any
appropriate procedure and is not limited to reuse in the particular procedure used before. The
discriminatory ability lies with the "wet reaction" involving the target nucleic acids with the
custom targeting polynucleotides that co-operate with the generic microarray of the invention.

According to a further aspect of the invention there is provided a kit for detecting the
presence or absence of one or more target nucleic acid sequences contained in a sample,
which kit comprises:-

(i) a plurality of polynucleotides, each polynucleotide comprising a unique 3' portion
substantially complementary to a unique target nucleic acid sequence which may be
present in a sample, a 5' portion complementary to one of a group of pre-selected
oligonucleotides that each possess substantially the same melting temperature (Tm) and are
attached at pre-defined positions to a solid support, and optionally a spacer moiety
interposed between said 3' portion and said 5' portion; and

(ii) a solid support having immobilised thereon a plurality of pre-selected oligonucleotides at
pre-defined positionally distinct sites, characterised in that the composition of each of the
oligonucleotides is such that they all have substantially the same melting temperature (Tm)
when annealed to their complementary sequence, each capture oligonucleotide having a
sequence complementary to a 5' portion of one of the polynucleotides in (i).

In a preferred embodiment the kit also contains some or all of the four different nucleoside
triphosphates and/or an agent for polymerisation of the nucleoside triphosphates and/or
instructions for use. In accordance with the general principle of the invention, the targeting
portion of the polynucleotides may be acting as hybridisation probes or
amplification/detection primers. In a preferred embodiment the kit comprises a set of at least
two primers for each target sequence, the terminal nucleotide of at least one primer being
complementary to a suspected variant nucleotide associated with a known genetic disorder
and at least one of the other primers being a companion primer as described hereinbefore. The
kit may therefore comprise sets of oligonucleotide primers, each set targeting different alleles
at a specific locus. In a further embodiment, the polynucleotide molecules referred to in (i) are
ARMS primers with non-amplifiable tails as described hereinbefore. In a further embodiment
the solid support is a microscope slide or microtitre plate, such as a 96-well plate. Such kits
may also comprise control DNA and control primers or probes, and the like.



The invention will now be further illustrated by the following non-limiting examples.
The examples refer to the following figures, in which:

Figure 1 - Illustrates the use of quenching oligonucleotides for measuring primer binding.

Figure 2a - Results from Experiment 1, 84-ARMS primer multiplex on 1% p53 codon 175
CAC admixture template and wild-type; OD 405nm mutant (ODm) and wild-type (ODw) plot
for the 11 test ARMS primers and one control primer.

Figure 2b - Results from Experiment 1, (ODm/ODw)-1 ratio of absorbance (405nm) on p53
wild-type and mutant admixture for 1% 175 CAC.

Figure 3a - Results from Experiment 1, 84-ARMS primer multiplex on 5% p53 codon 175
CAC admixture template and wild-type; OD 405nm mutant (ODm) and wild-type (ODw) plot
for the 11 test ARMS primers and one control primer.

Figure 3b - Results from Experiment 1, (ODm/ODw)-1 ratio of absorbance (405nm) on p53
wild-type and mutant admixture for 5% 175 CAC.

Figure 4a - Results from Experiment 2, 84-ARMS primer multiplex on 1% p53 codon 175
CAC admixture template and wild-type; OD 405nm mutant (ODm) and wild-type (ODw) plot
for the 17 test ARMS primers and two control primers.

Figure 4b - Results from Experiment 2, (ODm/ODw)-1 ratio of absorbance (405nm) on p53
wild-type and mutant admixture for 10% 175 CAC.



Example 1

Multiplex tailed ARMS assay to detect p53 mutations.

Many potential mutation sites in p53 have been identified (P. Hainaut et al., Nucleic
Acids Research. 26(1):205-213, 1998).

80 ARMS primers were designed for the specific detection of some of the mutations in
exons 5-8 of the p53 tumour suppressor gene (Table 3 lists the 80 codon positions and
specific mutations for which ARMS primers were designed and prepared). Uniquely
identifying non-amplifiable tails (with Tms in the range of 53°C to 58°C) with hexaethylene
glycol links between the primer and the tail sequences were added to 19 of these ARMS
primers (marked * in Table 3). The 80 ARMS primers were then multiplexed together with 2
reverse primers designed to give PCR products with the ARMS primers specific for mutations
in exons 5&6 and 8 respectively. Tailed primer sets which act as control primers for the
detection of p53 exons 5&6 and exon 8 sequence were also included.
Table 3.

List of potential p53 mutations on which the ARMS primers were prepared.

132 CAG*  132 AGG   135 TAC   141 TAC   151 TCC   151 ACC*  152 CTG   154 GTC
156 CCC*  157 TTC*  158 CTC*  158 CAC   159 GAC*  159 GTC   161 ACC   163 TGC
173 ATG   173 TTG*  175 CAC*  176 TTC*  177 CGC*  179 CGT   179 TAT   192 TAG
193 CGT   195 ACC   196 TGA   205 TGT   213 TGA   220 TGT*  228 AAC   234 TGC
237 ATA   238 TAT   241 TTC   242 TTC   244 TGC   245 TGC   245 GAC   245 AGC
245 CGC   245 GTC   245 GCC   248 CTG   248 CAG   248 TGG   248 GGG   248 CCG
249 AGC   249 AGT   249 ACG   249 ATG   249 GGG   249 AAG   258 AAA   258 GGA
266 AGA   266 GAA   266 TGA   266 GTA   272 ATG   273 CCT*  273 CTT*  273 CAT*
273 TGT*  273 AGT   273 GGT   275 TAT   278 CTT   278 TCT   280 ACA*  282 TGG
282 CAG   282 CCG   282 CTG   282 GGG   285 AAG*  286 AAA*  298 TAG*  306 TGA

* - denotes the mutant codon for which a tailed ARMS primer was prepared.



Templates containing mutations in the p53 gene were prepared by primer directed
mutagenesis (see Higuchi et al. NAR. 16:7350-7367, 1988). This gave templates of 500 -
800 bp containing the mutation of interest. Wild type templates were prepared by PCR
amplification of wild-type DNA to give the corresponding 500 - 800 bp fragments.

All templates were quantified relatively by real time analysis on an ABI Prism 7700
using a quantitative PCR reaction. Quantitated cassettes (mutant synthetic templates prepared
by site directed mutagenesis) were then used to prepare wild-type/mutant admixtures.

Oligonucleotides of precise complementary sequence to the 19 tail sequences were 
synthesised with 3' biotin moieties. These capture sequences were bound to the wells of a 
streptavidin coated microtitre plate (one capture sequence per well). 
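
Deriving a capture sequence from a tail sequence, as described above, amounts to taking the reverse complement. The sketch below illustrates this using the 5' tail portion of the codon 175 CAC primer given in Experiment 1; the function name is hypothetical, and the 3' biotin modification is a synthesis detail that is not modelled.

```python
# Translation table for the four DNA bases (upper case only).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    seq = sequence.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence may contain only A, C, G and T")
    return seq.translate(_COMPLEMENT)[::-1]


# 5' tail portion of the tailed ARMS primer for p53 175 CAC (from Experiment 1).
tail = "GCTTTATGTCCACAGATTTC"
capture = reverse_complement(tail)
```

Applying the function twice returns the original tail, which is a convenient sanity check when preparing a set of capture oligonucleotides.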

Reaction conditions: Each ARMS primer was present at 50 nM concentration, the
reverse primers were present at 500 nM concentration and wild type dNTPs at 50 µM each.
Fluorescein-dUTP was also included at 0.5 µM. The buffer was 50 mM KCl, 10 mM Tris, 1.2
mM MgCl2 at pH 8.3. 4 Units of AmpliTaq Gold™ were used per amplification. 10^5 copies
of template were added. Cycling conditions were 94°C for 20 minutes then 35 cycles of
(94°C for 1 minute, 60°C for 1 minute). In each experiment, three parallel amplifications were
run using (a) wild type template, (b) mutant/wild-type admixture template and (c) a
no-template control.

Following amplification, the PCR products were divided between the capture wells of
the microtitre plate. Hybridisation between the PCR products and the capture oligos took
place overnight at 55°C in 3M TMAC, 1M Tris (pH 7.5), 0.5M EDTA, 0.01% Triton-X-100,
0.1mg/ml herring sperm DNA. Unbound products were then washed off (2 washes in
phosphate buffered saline (PBS)). The PCR products were detected by ELISA detection of
incorporated fluorescein-dUTP using an anti-fluorescein-alkaline phosphatase
antibody-enzyme conjugate. Colour development was by addition of p-nitrophenyl phosphate and the
OD 405 was determined after 30 minutes.

WO 00/47767 PCT/GB00/00357

For each primer being examined the following ODs were obtained: (i) the mutant
template termed ODm; (ii) the wild-type template termed ODw; and, (iii) the no-template
control. The no-template control ODs were subtracted from ODm and ODw to give
background corrected values. ODm, ODw values and the (ODm/ODw - 1) ratio were then
plotted.
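
The background correction and ratio described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical and no numerical readings from the experiments are reproduced here.

```python
def background_corrected(od, od_no_template):
    """Subtract the no-template control OD from a raw OD405 reading."""
    return od - od_no_template


def selectivity_ratio(od_mutant, od_wild_type, od_no_template):
    """Return (ODm/ODw - 1) computed from raw readings.

    ODm and ODw are first corrected by subtracting the no-template control.
    A value near 0 means similar signal on both templates; a large positive
    value means the signal is higher on the mutant admixture.
    """
    odm = background_corrected(od_mutant, od_no_template)
    odw = background_corrected(od_wild_type, od_no_template)
    if odw <= 0:
        # The ratio is undefined when the wild-type signal is at or below background.
        raise ValueError("corrected wild-type OD must be positive to form a ratio")
    return odm / odw - 1.0
```

With made-up readings of 0.90 (mutant admixture), 0.30 (wild type) and 0.10 (no-template control), the corrected values are 0.80 and 0.20 and the ratio is approximately 3.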



Experiment 1 - Detection of a single point mutation at codon position 175 of p53 present as
template at a concentration of 1% or 5%.

Admixtures were prepared between template containing p53 mutant codon 175 CAC 
and wild-type template with the mutant sequence present at 1% and 5% of the total. 

Specific tailed ARMS primer used to detect p53 175 CAC mutation:

5'-GCTTTATGTCCACAGATTTC*ATACACAGCACATGACGGAGGTTGTGAGCCA-3'
SEQ ID No. 1 represents the 5' tail portion; SEQ ID No. 2 represents the 3' targeting portion.
The * denotes the HEG group.

Specific reverse primer used with the above ARMS primer:

5'-ACCCGGAGGGCCACTGACAAC-3' (SEQ ID No. 3)

For each admixture three separate amplifications (using the multiplex of 80 mutant
ARMS primers and 2 control primers, 19 of which had tails so that they could be captured; see
Table 3) were carried out on: (a) the admixture; (b) wild-type template; and, (c) no-template
control. The amplification products were then each added to a separate array consisting of a
microtitre dish with 11 capture oligos for a subset of the primers plus 1 capture oligo for the
exon 5 & 6 control reaction immobilised thereon.



The ODm and ODw values with the relevant no-template control values subtracted are
shown in Figures 2a and 3a. The figures show that the primer for 175 CAC gives a signal on
the mutant template (bars) which is higher than the one obtained with wild-type template
(diamonds). The size of the signal is dependent on the amount of mutant template present in
the substrate and is therefore greater for the 5% mutant template than for the 1% mutant
template.

The ability of the primer for 175 CAC to differentially detect low levels of mutant
sequence can also be seen by calculating the (ODm/ODw - 1) ratios (see Figures 2b and 3b).
From these values it can be clearly seen that only the primer for 175 CAC demonstrates a
significantly higher OD on the mutant template than on the wild-type template.

Careful examination of the OD405nm data and the (ODm/ODw - 1) plots permits one
to distinguish between primers which are selectively detecting mutant sequence and those that
are giving unselective amplification from wild-type as well as mutant template. Experiment 2
illustrates this more fully.

Experiment 2 - Detection of a single point mutation at codon position 175 of p53 present as
template at a concentration of 10%.

An admixture was prepared with template containing p53 mutant codon 175 CAC
at 10% in a wild-type background. As in Experiment 1, three separate amplifications were
carried out on: (a) the admixture; (b) wild-type template; and, (c) no-template control. The
products of the amplifications were then each added to a separate array consisting of 17
capture oligos for the primers plus 2 capture oligos for the exon 5, 6 and exon 8 control
reactions. The ODm and ODw values with the relevant no-template control values subtracted
are shown in Figure 4a. This figure shows that the primer for 175 CAC has given a signal on
the mutant template (bars) which is far higher than the one obtained with wild-type template
(diamonds). It can also be seen that the primer for the 220 TGT mutation has a propensity to
prime and give detectable product with wild-type template as well as on the mutant template.
That this primer is not erroneously detecting mutant sequence can be seen by calculating the
(ODm/ODw - 1) ratios (see Figure 4b), in which case it can be clearly seen that only the primer
for 175 CAC is giving a higher OD on the mutant template than on the wild-type template.
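
The distinction drawn in Experiments 1 and 2 between selective and unselective primers can be expressed as a simple rule on the background corrected values. The Python sketch below is an assumption-laden illustration: the function name, the two thresholds (`min_ratio`, `min_signal`) and all OD values used with it are hypothetical and are not taken from the figures.

```python
def classify_primers(corrected, min_ratio=1.0, min_signal=0.1):
    """Label each primer from its background corrected (ODm, ODw) pair.

    corrected  -- dict mapping primer name to (ODm, ODw)
    min_ratio  -- hypothetical cut-off for the (ODm/ODw - 1) ratio
    min_signal -- hypothetical OD below which a well is treated as blank
    """
    labels = {}
    for primer, (odm, odw) in corrected.items():
        if odm < min_signal and odw < min_signal:
            labels[primer] = "no signal"
        elif odw <= 0 or odm / odw - 1.0 >= min_ratio:
            # Signal is clearly higher on the mutant admixture.
            labels[primer] = "selective"
        else:
            # Comparable signal on wild-type and mutant template.
            labels[primer] = "unselective"
    return labels
```

For example, invented values of (0.80, 0.10) for the 175 CAC primer and (0.60, 0.55) for the 220 TGT primer would be labelled "selective" and "unselective" respectively, mirroring the qualitative behaviour described for Figure 4b.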



Summary:

Example 1 demonstrates the use of ARMS for the sensitive detection of
under-represented sequences in combination with the use of non-amplifiable tails and
oligonucleotide arrays to provide a method for the large scale multiplex analysis of
polymorphisms in gene sequences.

The use of ARMS in solution phase permits the more sensitive detection of gene 
variation than can be achieved with solid phase allele specific oligonucleotide (ASO) 
hybridisation. 

The use of non-amplifiable tails and oligonucleotide arrays presents a more generic
and widely applicable technique than can be achieved with ASO arrays which must be
individually designed on a target to target basis. In this way a maximally efficient mutation
detection system is produced because each component used in the process is suited to the
task it is required to carry out. Mutation detection can be carried out in solution phase
using ARMS, and DNA arrays can be used for separating complex mixtures of oligonucleotide
sequences.

The use of non-amplifiable tails and oligonucleotide arrays presents a simpler
hybridisation technique than can be achieved with ASO arrays. With ASO arrays all of the
probe to target hybridisations must be carried out under the same conditions including
the same temperature and buffer. These conditions may not be ideal for many of the probe to
target hybridisations which need to be performed. With generic arrays and non-amplifiable
tails a single set of unified hybridisation conditions can be pre-selected which permits all probe
to target hybridisations to be carried out under the same optimal conditions because all probe
to target hybridisations can be selected to occur between sequences of substantially the same
Tm and GC content.

The use of non-amplifiable tails and oligonucleotide arrays also presents a more cost
effective way of screening for mutations than can be achieved with ASO arrays. The
manufacture of the array used in the screening process is cheaper than is the case with specific
ASO arrays because:

1. Depending on its use, the generic array will likely require far fewer capture sequences for
the analysis of each variant in the gene of interest than a specific ASO array and is
therefore far simpler and cheaper to manufacture.

2. Once the array has been designed it may be used for the analysis of any gene target. The
costs of designing and developing new arrays for new gene targets are avoided.

3. With one array being used for the analysis of all gene targets economy of scale can be
realised during manufacture compared to the situation where smaller manufacturing runs
are undertaken to produce a multitude of specific arrays.

Example 2

Design of suitable oligonucleotides with substantially the same Tm.

Oligonucleotide sequences of substantially the same Tm can conveniently be generated
by use of a spreadsheet computer program incorporating a random number generator. By way
of example the following Visual Basic macro, when run in Microsoft Excel™, will generate
random sequences of between 10 and 30 bases in length. The Tm of these sequences is then
calculated using the simple algorithm Tm = [2*(#A or T) + 4*(#G or C)] (i.e. each A or T
base pair adds 2°C to the Tm while a G or a C adds 4°C). The program then sorts the
sequences in order of increasing Tm. In this way oligos of substantially the same Tm can be
selected as candidate sequences for use as tails. It should be understood that the use of this
Tm algorithm is illustrative only and that any convenient algorithm could be used.
Macro:

Option Explicit
Sub probe()
    'Declare arrays
    Dim well
    Dim number
    Dim repeat
    Dim percent
    'get number of sequences to generate
    Let number = Application.InputBox("Enter number of random sequences to generate")
    'redeclare arrays
    ReDim NextCell(30)
    ReDim Character(30)
    ReDim NoBases(number)
    ReDim NoGC(number)
    ReDim PercentGC(number)
    ReDim Tm(number)
    Dim limit
    Dim total
    'Generate Probe Sequences
    For repeat = 1 To number
        'determine sequence length
        Range("BB1").Select
        ActiveCell.FormulaR1C1 = "=ABS(RAND())"
        Range("BB1").Select
        Let limit = (ActiveCell * 20) + 10
        'generate sequence
        For well = 1 To limit
            Range("A1").Select
            ActiveCell.Offset(repeat, well).FormulaR1C1 = "=ABS(RAND())"
            Range("A1").Select
            ActiveCell.Offset(repeat, well).Select
            Selection.Copy
            Range("A1").Select
            ActiveCell.Offset(repeat, well).Select
            Selection.PasteSpecial Paste:=xlValues, Operation:=xlNone, _
                SkipBlanks:=False, Transpose:=False
        Next well
        'convert to bases, detect G and C, total Tm
        Let NoBases(repeat) = 0
        Let NoGC(repeat) = 0
        Let Tm(repeat) = 0
        For well = 1 To limit
            Range("A1").Select
            Let NextCell(well) = ActiveCell.Offset(repeat, well)
            If (NextCell(well) < 0.25 And NextCell(well) > 0#) Then
                Let Character(well) = "A"
                Let NoBases(repeat) = NoBases(repeat) + 1
                Let Tm(repeat) = Tm(repeat) + 2
            End If
            Let NextCell(well) = ActiveCell.Offset(repeat, well)
            If (NextCell(well) < 0.5 And NextCell(well) > 0.25) Then
                Let Character(well) = "C"
                Let NoBases(repeat) = NoBases(repeat) + 1
                Let NoGC(repeat) = NoGC(repeat) + 1
                Let Tm(repeat) = Tm(repeat) + 4
            End If
            Let NextCell(well) = ActiveCell.Offset(repeat, well)
            If (NextCell(well) < 0.75 And NextCell(well) > 0.5) Then
                Let Character(well) = "G"
                Let NoBases(repeat) = NoBases(repeat) + 1
                Let NoGC(repeat) = NoGC(repeat) + 1
                Let Tm(repeat) = Tm(repeat) + 4
            End If
            Let NextCell(well) = ActiveCell.Offset(repeat, well)
            If NextCell(well) > 0.75 Then
                Let Character(well) = "T"
                Let NoBases(repeat) = NoBases(repeat) + 1
                Let Tm(repeat) = Tm(repeat) + 2
            End If
        Next well
        Range("A1").Select
        For well = 1 To limit
            ActiveCell.Offset(repeat, well).FormulaR1C1 = Character(well)
        Next well
        Range("A1").Select
        ActiveCell.Offset(repeat, 0).FormulaR1C1 = repeat
        'calculate %GC
        Let PercentGC(repeat) = (NoGC(repeat) / NoBases(repeat)) * 100
    Next repeat
    'output calculations
    Range("AG1").Select
    For repeat = 1 To number
        ActiveCell.Offset(repeat, 0).FormulaR1C1 = NoBases(repeat)
    Next repeat
    Range("AH1").Select
    For repeat = 1 To number
        ActiveCell.Offset(repeat, 0).FormulaR1C1 = PercentGC(repeat)
    Next repeat
    Range("AI1").Select
    For repeat = 1 To number
        ActiveCell.Offset(repeat, 0).FormulaR1C1 = Tm(repeat)
    Next repeat
    Columns("A:AF").Select
    Selection.ColumnWidth = 4
    ActiveWindow.SmallScroll ToRight:=17
    Columns("AH:AH").Select
    Selection.NumberFormat = "0.0"
    Range("AG1").Select
    ActiveCell.FormulaR1C1 = "# bases"
    Range("AH1").Select
    ActiveCell.FormulaR1C1 = "% GC"
    Range("AI1").Select
    ActiveCell.FormulaR1C1 = "Tm °C"
    Range("A1").Select
    ActiveWindow.ScrollColumn = 1
    Columns("A:AI").Select
    Selection.Sort Key1:=Range("AI2"), Order1:=xlAscending, Header:= _
        xlYes, OrderCustom:=1, MatchCase:=False, Orientation:= _
        xlTopToBottom
    Range("A1").Select
End Sub

The design and choice of suitable tail sequences with substantially the same Tm can of
course also be carried out manually using the simple algorithm Tm = [2*(#A or T) + 4*(#G or
C)].
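
For readers without access to a spreadsheet, the same procedure (random sequences of 10 to 30 bases, Tm from the 2°C/4°C rule, sorting by Tm) can be sketched in Python. This is not the macro above translated line by line: the function names, the `tolerance` parameter and the use of Python's `random` module are assumptions made for illustration.

```python
import random


def simple_tm(seq):
    """Tm in degrees C from the rule Tm = 2*(#A or T) + 4*(#G or C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence may contain only A, C, G and T")
    return 2 * at + 4 * gc


def random_sequences(n, min_len=10, max_len=30, rng=None):
    """Generate n random sequences with lengths between min_len and max_len."""
    rng = rng or random.Random()
    return [
        "".join(rng.choice("ACGT") for _ in range(rng.randint(min_len, max_len)))
        for _ in range(n)
    ]


def candidates_near(seqs, target_tm, tolerance=2):
    """Return sequences whose Tm lies within tolerance of target_tm, sorted by Tm."""
    hits = [s for s in seqs if abs(simple_tm(s) - target_tm) <= tolerance]
    return sorted(hits, key=simple_tm)
```

Applied to the 20-base tail printed for the codon 175 primer in Example 1 (GCTTTATGTCCACAGATTTC: 12 A/T and 8 G/C), `simple_tm` returns 56. Calling `candidates_near(random_sequences(1000), 56)` would then list random sequences with a comparable calculated Tm as candidate tails.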



Example 3

A suitable format for differential expression studies.

cDNA is prepared from mRNA purified from two types of tissue representing normal
and altered cells. The altered cells may have been treated differently to the normal cells prior
to mRNA purification. Examples of 'different treatment' include starving the cells of a
metabolite or metabolites, stimulating them with a specific metabolite such as a growth factor
or treating them with a drug or hormone. Alternatively, different cell conditions might exist
already, e.g. normal vs. tumour.

Typically, reverse transcription from an oligo-dT primer is carried out with
incorporation of fluorescent labels into the cDNA prepared. Often one label such as Cy-3 is
used to label the cDNA from the normal tissue and another label such as Cy-5 is used to label
the cDNA from the treated tissue. The cDNA population is optionally fragmented by
sonication or mechanical shearing (e.g. by passage through a 19G needle). Targeting
polynucleotides are added to the cDNA mixture and hybridisation permitted to occur, for
example, under the following hybridisation conditions: labelled cDNA is resuspended in 10 ml
of 3.5 x SSC containing 4 mg of poly dA DNA, 2.5 mg E. coli tRNA, 4 mg human Cot1 DNA
and 1 ml of 10% SDS. Hybridisation is carried out at 62°C for 3 hours.

To facilitate quantitative analysis and ensure efficient target binding and identification,
each cDNA is targeted by a number of distinct targeting polynucleotides each capable of
hybridising to a different region of the cDNA, but all possessing the same tail sequence to
facilitate capture at the same pre-determined location on the microarray. By way of example,
each cDNA is targeted by 20 or more distinct targeting polynucleotides. As described above,
the targeting polynucleotides comprise two principal domains: the first domain is
complementary to a distinct region of one of the cDNAs in the sample mixture and the second
domain (tail portion) is complementary to one of the capture oligonucleotides on the DNA
array. If the range of Tm between the target sequences and their complementary sequences on
the targeting polynucleotides is wide, it may be preferable to group the targeting
polynucleotides into pools of substantially the same Tm. In this way, separate
sample:targeting polynucleotide hybridisations can be carried out under the optimum
hybridisation conditions for each pool. All the pooled reaction products can then be mixed
prior to the capture hybridisation. Once hybridisation between the cDNA and the targeting
polynucleotides is substantially complete, the mixture is added to the surface of the
oligonucleotide microarray under suitable conditions to allow hybridisation between the
targeting polynucleotide tail portions and the capture oligonucleotides to occur.
Hybridisation is carried out at 62°C for 1-3 hours in a suitable volume of hybridisation
solution such as 10 ml of 6x SSC, 0.1% SDS and 0.25% dried skimmed milk (Marvel™) in a
suitable enclosed vessel. A proprietary hybridisation apparatus such as model HB-1 (Techne
Ltd) provides reproducible conditions for the experiment. On completion of hybridisation the
microarray is subjected to a stringency wash (such as in 2x SSC, 0.2% SDS, then 0.2x SSC)
and the array surface is subjected to fluorescence detection. Fluorescent output from the two
dyes is captured and stored as separate channels. The intensity of the two data sets is
normalised by reference to a common housekeeping gene whose expression is considered to be
invariant in all tissues. There are many such genes but one example is GAPDH.
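
The grouping of targeting polynucleotides into pools of substantially the same Tm, mentioned above, could be automated along the following lines. This Python sketch is illustrative only: the function name, the greedy grouping strategy and the `max_spread` parameter (the widest Tm range allowed within one pool) are assumptions and are not prescribed by the text.

```python
def pool_by_tm(tm_by_probe, max_spread=4.0):
    """Group probes into pools whose Tm values span at most max_spread degrees.

    tm_by_probe -- dict mapping a targeting polynucleotide name to the Tm of
                   its target-complementary domain
    Probes are sorted by Tm and a new pool is started whenever the next probe
    would exceed max_spread above the lowest Tm in the current pool.
    """
    pools = []
    for name, tm in sorted(tm_by_probe.items(), key=lambda item: item[1]):
        if pools and tm - pools[-1][0][1] <= max_spread:
            pools[-1].append((name, tm))
        else:
            pools.append([(name, tm)])
    # Return only the names, pool by pool, in order of increasing Tm.
    return [[name for name, _ in pool] for pool in pools]
```

With hypothetical Tm values of 52, 54, 60, 63 and 70°C for probes p1 to p5 and the default spread of 4°C, the result is three pools: p1 with p2, p3 with p4, and p5 alone. Each pool could then be hybridised to the sample at its own optimum temperature before the products are combined for capture.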

Having normalised the data, the difference in intensity at each point on the microarray is
measured. Up or down regulation of genes in the treated tissues would be seen as increases or
decreases respectively in the intensity of the Cy-5 signal compared to the intensity of the same
array spot in the Cy-3 channel.
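
The normalisation and comparison steps can be illustrated with a short Python sketch. Several choices here are assumptions rather than part of the described method: the function name, the representation of each channel as a dictionary of spot intensities, and the reporting of differences as log2 ratios (the text speaks only of increases or decreases in intensity).

```python
import math


def normalised_log_ratios(cy3, cy5, reference="GAPDH"):
    """Compare Cy-5 (treated) with Cy-3 (normal) intensities spot by spot.

    cy3, cy5  -- dicts mapping a spot or gene name to its intensity
    reference -- housekeeping gene assumed invariant between the tissues
    The Cy-5 channel is rescaled so that the reference spot has the same
    intensity in both channels; the result is log2(scaled Cy-5 / Cy-3).
    Positive values indicate up regulation in the treated tissue.
    """
    scale = cy3[reference] / cy5[reference]
    return {gene: math.log2(cy5[gene] * scale / cy3[gene]) for gene in cy3}
```

With invented intensities in which GAPDH reads 1000 in the Cy-3 channel and 2000 in the Cy-5 channel, a spot reading 500 (Cy-3) and 4000 (Cy-5) gives a value of 2.0 (four-fold up regulation), while a spot reading 800 and 400 gives -2.0.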



